* [PATCH 0/9] ZBC / Zoned block device support
From: Damien Le Moal @ 2016-09-19 21:27 UTC
  To: linux-scsi, linux-block
  Cc: martin.petersen, axboe, hare, shaun.tancheff, Damien Le Moal

This series introduces support for ZBC zoned block devices. It integrates
earlier submissions by Hannes Reinecke and Shaun Tancheff and includes
rewrites and corrections suggested by Christoph Hellwig.

For zoned block devices, a zone information cache implemented as an RB-tree
is attached to the device request queue and maintained from the SCSI disk
driver layer. The cache is used to check the alignment of read and write
commands to the zone position and to the write pointer position within zones.
The generic block layer API defines functions for obtaining zone information
and manipulating zones. Operations on zones are defined as request operations
which trigger zone cache changes when processed in the sd driver.
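
As an illustration of the kind of check such a zone cache makes possible,
here is a minimal user-space sketch in C. The struct zone_info layout and the
helper names are hypothetical and are not taken from this series. The sketch
assumes only that a command must stay within a single zone and that a write
to a sequential zone must start exactly at the write pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified zone descriptor (not the series' structure). */
struct zone_info {
	uint64_t start;		/* first sector of the zone */
	uint64_t len;		/* zone length in sectors */
	uint64_t wp;		/* write pointer, as an absolute sector */
	bool seq_required;	/* true for a sequential-write-required zone */
};

/* True if [sector, sector + nr) lies entirely inside the zone. */
static inline bool zone_contains(const struct zone_info *z,
				 uint64_t sector, uint64_t nr)
{
	return nr > 0 && nr <= z->len && sector >= z->start &&
	       sector - z->start <= z->len - nr;
}

/* True if a write of nr sectors at sector respects the zone constraints. */
static inline bool zone_write_allowed(const struct zone_info *z,
				      uint64_t sector, uint64_t nr)
{
	if (!zone_contains(z, sector, nr))
		return false;
	if (!z->seq_required)
		return true;	/* conventional zone: any position is fine */
	return sector == z->wp;	/* sequential zone: must hit the write pointer */
}
```

With a zone starting at sector 1024, 512 sectors long and a write pointer at
1100, a write at 1100 passes while a write at 1108 is rejected.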

Most of the ZBC-specific code is kept out of sd.c and implemented in the
new file sd_zbc.c. Similarly, at the block layer, most of the zoned block
device code is implemented in the new blk-zoned.c.

For host-managed zoned block devices, the sequential write constraint of
write pointer zones is exposed to the user. Users of the disk (applications,
file systems or device mappers) must write to zones sequentially. This means
that for raw block device accesses from applications, buffered writes are
unreliable and direct I/Os must be used (or buffered writes with O_SYNC).
At the SCSI layer, write ordering is maintained at dispatch time for both
the simple queue model and the scsi-mq model using a zone write lock. This
lock, implemented as a flag in the zone information, prevents issuing
multiple writes to a single zone at the same time. In effect, this results
in a write queue depth of 1 per zone while allowing the drive to be operated
at a high queue depth overall.
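
The per-zone write lock described above can be pictured with the following
user-space sketch. It is not the sd_zbc.c implementation: struct zone_state
and both helpers are hypothetical names, and C11 atomic_flag stands in for
whatever bit operation the driver actually uses on its zone flags.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-zone state holding only the write-lock flag. */
struct zone_state {
	atomic_flag write_locked;
};

/*
 * Try to take the zone write lock at dispatch time.
 * Returns true if the caller may issue its write now, false if another
 * write to the same zone is already in flight and this one must wait.
 */
static inline bool zone_write_trylock(struct zone_state *z)
{
	return !atomic_flag_test_and_set(&z->write_locked);
}

/* Release the lock when the in-flight write completes. */
static inline void zone_write_unlock(struct zone_state *z)
{
	atomic_flag_clear(&z->write_locked);
}
```

Because the flag is per zone, a second write to a locked zone is held back
while writes to other zones can still be dispatched, which is how a depth of
1 per zone can coexist with a high overall queue depth.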

Access to zone manipulation operations is also provided to applications
through a set of new ioctls. This allows applications operating on raw
block devices (e.g. mkfs.xxx) to discover the zone layout of a device and
manipulate zone state.

Damien Le Moal (1):
  block: Add 'zoned' queue limit

Hannes Reinecke (6):
  blk-sysfs: Add 'chunk_sectors' to sysfs attributes
  block: update chunk_sectors in blk_stack_limits()
  block: Implement support for zoned block devices
  block: Add 'BLKPREP_DONE' return value
  block: Add 'BLK_MQ_RQ_QUEUE_DONE' return value
  sd: Implement support for ZBC devices

Shaun Tancheff (2):
  block: Define zoned block device operations
  blk-zoned: Add ioctl interface for zone operations

 block/Kconfig                 |    8 +
 block/Makefile                |    1 +
 block/blk-core.c              |   53 +-
 block/blk-merge.c             |   31 +-
 block/blk-mq.c                |    1 +
 block/blk-settings.c          |    5 +
 block/blk-sysfs.c             |   29 ++
 block/blk-zoned.c             |  453 +++++++++++++++++
 block/ioctl.c                 |    8 +
 drivers/scsi/Makefile         |    1 +
 drivers/scsi/scsi_lib.c       |    4 +
 drivers/scsi/sd.c             |  147 +++++-
 drivers/scsi/sd.h             |   68 +++
 drivers/scsi/sd_zbc.c         | 1097 +++++++++++++++++++++++++++++++++++++++++
 include/linux/bio.h           |   36 +-
 include/linux/blk-mq.h        |    1 +
 include/linux/blk_types.h     |   27 +-
 include/linux/blkdev.h        |  146 ++++++
 include/scsi/scsi_proto.h     |   17 +
 include/uapi/linux/Kbuild     |    1 +
 include/uapi/linux/blkzoned.h |   91 ++++
 include/uapi/linux/fs.h       |    1 +
 22 files changed, 2170 insertions(+), 56 deletions(-)
 create mode 100644 block/blk-zoned.c
 create mode 100644 drivers/scsi/sd_zbc.c
 create mode 100644 include/uapi/linux/blkzoned.h

-- 
2.7.4

Western Digital Corporation (and its subsidiaries) E-mail Confidentiality Notice & Disclaimer:

This e-mail and any files transmitted with it may contain confidential or legally privileged information of WDC and/or its affiliates, and are intended solely for the use of the individual or entity to which they are addressed. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited. If you have received this e-mail in error, please notify the sender immediately and delete the e-mail in its entirety from your system.

* [PATCH 1/9] block: Add 'zoned' queue limit
From: Damien Le Moal @ 2016-09-19 21:27 UTC
  To: linux-scsi, linux-block
  Cc: martin.petersen, axboe, hare, shaun.tancheff, Damien Le Moal

Add the zoned queue limit to indicate the zoning model of a block
device. Defined values are 0 (BLK_ZONED_NONE) for regular block
devices, 1 (BLK_ZONED_HA) for host-aware zoned block devices and 2
(BLK_ZONED_HM) for host-managed zoned block devices. The drive-managed
model is not defined here since these block devices do not provide any
command for accessing zone information. The helper functions
blk_queue_zoned() and bdev_zoned() return the zoned limit, which can in
turn be used as a boolean to test if a block device is zoned.

The zoned attribute is also exported as a string to applications via
sysfs. BLK_ZONED_NONE shows as "none", BLK_ZONED_HA as "host-aware" and
BLK_ZONED_HM as "host-managed".

Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
---
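For reference, a user-space consumer of the new attribute might look like the
sketch below. The enum and function names are hypothetical. Only the three
strings ("none", "host-aware", "host-managed") and the numeric values come
from this patch; the attribute is assumed to appear as
/sys/block/<dev>/queue/zoned alongside the other queue attributes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical user-space mirror of the values described in this patch. */
enum zoned_model {
	ZONED_NONE = 0,
	ZONED_HOST_AWARE = 1,
	ZONED_HOST_MANAGED = 2,
};

static inline int attr_equals(const char *s, size_t n, const char *name)
{
	return n == strlen(name) && strncmp(s, name, n) == 0;
}

/* Map the sysfs string (with or without trailing newline) to a model. */
static inline enum zoned_model parse_zoned_attr(const char *s)
{
	size_t n = strcspn(s, "\n");

	if (attr_equals(s, n, "host-aware"))
		return ZONED_HOST_AWARE;
	if (attr_equals(s, n, "host-managed"))
		return ZONED_HOST_MANAGED;
	return ZONED_NONE;	/* "none" or anything unrecognized */
}

/* Read the attribute file at path; returns 0 on success, -1 on error. */
static inline int read_zoned_model(const char *path, enum zoned_model *model)
{
	char buf[32];
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	*model = parse_zoned_attr(buf);
	return 0;
}
```

As with the kernel helpers, the returned value can be tested as a boolean,
since ZONED_NONE is 0.
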
 block/blk-settings.c   |  1 +
 block/blk-sysfs.c      | 18 ++++++++++++++++++
 include/linux/blkdev.h | 25 +++++++++++++++++++++++++
 3 files changed, 44 insertions(+)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index f679ae1..b1d5b7f 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -107,6 +107,7 @@ void blk_set_default_limits(struct queue_limits *lim)
 	lim->io_opt = 0;
 	lim->misaligned = 0;
 	lim->cluster = 1;
+	lim->zoned = BLK_ZONED_NONE;
 }
 EXPORT_SYMBOL(blk_set_default_limits);
 
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index f87a7e7..31ecff9 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -257,6 +257,18 @@ QUEUE_SYSFS_BIT_FNS(random, ADD_RANDOM, 0);
 QUEUE_SYSFS_BIT_FNS(iostats, IO_STAT, 0);
 #undef QUEUE_SYSFS_BIT_FNS
 
+static ssize_t queue_zoned_show(struct request_queue *q, char *page)
+{
+	switch (blk_queue_zoned(q)) {
+	case BLK_ZONED_HA:
+		return sprintf(page, "host-aware\n");
+	case BLK_ZONED_HM:
+		return sprintf(page, "host-managed\n");
+	default:
+		return sprintf(page, "none\n");
+	}
+}
+
 static ssize_t queue_nomerges_show(struct request_queue *q, char *page)
 {
 	return queue_var_show((blk_queue_nomerges(q) << 1) |
@@ -485,6 +497,11 @@ static struct queue_sysfs_entry queue_nonrot_entry = {
 	.store = queue_store_nonrot,
 };
 
+static struct queue_sysfs_entry queue_zoned_entry = {
+	.attr = {.name = "zoned", .mode = S_IRUGO },
+	.show = queue_zoned_show,
+};
+
 static struct queue_sysfs_entry queue_nomerges_entry = {
 	.attr = {.name = "nomerges", .mode = S_IRUGO | S_IWUSR },
 	.show = queue_nomerges_show,
@@ -546,6 +563,7 @@ static struct attribute *default_attrs[] = {
 	&queue_discard_zeroes_data_entry.attr,
 	&queue_write_same_max_entry.attr,
 	&queue_nonrot_entry.attr,
+	&queue_zoned_entry.attr,
 	&queue_nomerges_entry.attr,
 	&queue_rq_affinity_entry.attr,
 	&queue_iostats_entry.attr,
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index e79055c..1c74b19 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -261,6 +261,15 @@ struct blk_queue_tag {
 #define BLK_SCSI_MAX_CMDS	(256)
 #define BLK_SCSI_CMD_PER_LONG	(BLK_SCSI_MAX_CMDS / (sizeof(long) * 8))
 
+/*
+ * Zoned block device models (zoned limit).
+ */
+enum blk_zoned_model {
+	BLK_ZONED_NONE,	/* Regular block device */
+	BLK_ZONED_HA, 	/* Host-aware zoned block device */
+	BLK_ZONED_HM,	/* Host-managed zoned block device */
+};
+
 struct queue_limits {
 	unsigned long		bounce_pfn;
 	unsigned long		seg_boundary_mask;
@@ -290,6 +299,7 @@ struct queue_limits {
 	unsigned char		cluster;
 	unsigned char		discard_zeroes_data;
 	unsigned char		raid_partial_stripes_expensive;
+	unsigned char		zoned;
 };
 
 struct request_queue {
@@ -627,6 +637,11 @@ static inline unsigned int blk_queue_cluster(struct request_queue *q)
 	return q->limits.cluster;
 }
 
+static inline unsigned int blk_queue_zoned(struct request_queue *q)
+{
+	return q->limits.zoned;
+}
+
 /*
  * We regard a request as sync, if either a read or a sync write
  */
@@ -1354,6 +1369,16 @@ static inline unsigned int bdev_write_same(struct block_device *bdev)
 	return 0;
 }
 
+static inline unsigned int bdev_zoned(struct block_device *bdev)
+{
+	struct request_queue *q = bdev_get_queue(bdev);
+
+	if (q)
+		return blk_queue_zoned(q);
+
+	return 0;
+}
+
 static inline int queue_dma_alignment(struct request_queue *q)
 {
 	return q ? q->dma_alignment : 511;
-- 
2.7.4

* [PATCH 2/9] blk-sysfs: Add 'chunk_sectors' to sysfs attributes
From: Damien Le Moal @ 2016-09-19 21:27 UTC
  To: linux-scsi, linux-block
  Cc: martin.petersen, axboe, hare, shaun.tancheff, Damien Le Moal

From: Hannes Reinecke <hare@suse.de>

The queue limits already have a 'chunk_sectors' setting, so
we should be presenting it via sysfs.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
---
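A minimal sketch of how an application could consume the new attribute
follows. The helper name is hypothetical. The sketch assumes the attribute
text is a decimal sector count, as printed by queue_var_show(), and that the
count is in 512-byte sectors.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Convert the text of the chunk_sectors attribute (for example "524288\n")
 * to a size in bytes. Returns 0 on success, -1 if the text is not a plain
 * decimal number or the result would overflow.
 */
static inline int chunk_sectors_to_bytes(const char *attr, uint64_t *bytes)
{
	char *end;
	unsigned long long sectors;

	errno = 0;
	sectors = strtoull(attr, &end, 10);
	if (errno || end == attr || (*end != '\n' && *end != '\0'))
		return -1;
	if (sectors > UINT64_MAX / 512)
		return -1;

	*bytes = (uint64_t)sectors * 512;
	return 0;
}
```

A value of 524288 sectors, for instance, corresponds to 256 MiB.
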
 block/blk-sysfs.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 31ecff9..15e5baf 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -130,6 +130,11 @@ static ssize_t queue_physical_block_size_show(struct request_queue *q, char *pag
 	return queue_var_show(queue_physical_block_size(q), page);
 }
 
+static ssize_t queue_chunk_sectors_show(struct request_queue *q, char *page)
+{
+	return queue_var_show(q->limits.chunk_sectors, page);
+}
+
 static ssize_t queue_io_min_show(struct request_queue *q, char *page)
 {
 	return queue_var_show(queue_io_min(q), page);
@@ -455,6 +460,11 @@ static struct queue_sysfs_entry queue_physical_block_size_entry = {
 	.show = queue_physical_block_size_show,
 };
 
+static struct queue_sysfs_entry queue_chunk_sectors_entry = {
+	.attr = {.name = "chunk_sectors", .mode = S_IRUGO },
+	.show = queue_chunk_sectors_show,
+};
+
 static struct queue_sysfs_entry queue_io_min_entry = {
 	.attr = {.name = "minimum_io_size", .mode = S_IRUGO },
 	.show = queue_io_min_show,
@@ -555,6 +565,7 @@ static struct attribute *default_attrs[] = {
 	&queue_hw_sector_size_entry.attr,
 	&queue_logical_block_size_entry.attr,
 	&queue_physical_block_size_entry.attr,
+	&queue_chunk_sectors_entry.attr,
 	&queue_io_min_entry.attr,
 	&queue_io_opt_entry.attr,
 	&queue_discard_granularity_entry.attr,
-- 
2.7.4

* [PATCH 3/9] block: update chunk_sectors in blk_stack_limits()
From: Damien Le Moal @ 2016-09-19 21:27 UTC
  To: linux-scsi, linux-block
  Cc: martin.petersen, axboe, hare, shaun.tancheff, Hannes Reinecke,
	Damien Le Moal

From: Hannes Reinecke <hare@suse.de>

Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
---
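The stacking rule added by this patch can be tried out in user space with the
sketch below. min_not_zero_uint() is a plain-function stand-in for the
kernel's min_not_zero() macro, and stack_chunk_sectors() is a hypothetical
helper that mirrors the new hunk in blk_stack_limits().

```c
#include <assert.h>

/* Smaller of two values, ignoring a value of 0 (0 means "not set"). */
static inline unsigned int min_not_zero_uint(unsigned int x, unsigned int y)
{
	if (x == 0)
		return y;
	if (y == 0)
		return x;
	return x < y ? x : y;
}

/*
 * Combine the chunk_sectors limit of a stacked (top) device with that of
 * one underlying (bottom) device, following the logic of the hunk.
 */
static inline unsigned int stack_chunk_sectors(unsigned int top,
					       unsigned int bottom)
{
	if (bottom)
		top = min_not_zero_uint(top, bottom);
	return top;
}
```

An unset top limit inherits the bottom value, a bottom device without a limit
leaves the top value untouched, and otherwise the smaller value wins.
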
 block/blk-settings.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index b1d5b7f..55369a6 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -631,6 +631,10 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
 			t->discard_granularity;
 	}
 
+	if (b->chunk_sectors)
+		t->chunk_sectors = min_not_zero(t->chunk_sectors,
+						b->chunk_sectors);
+
 	return ret;
 }
 EXPORT_SYMBOL(blk_stack_limits);
-- 
2.7.4

* [PATCH 4/9] block: Define zoned block device operations
From: Damien Le Moal @ 2016-09-19 21:27 UTC
  To: linux-scsi, linux-block
  Cc: martin.petersen, axboe, hare, shaun.tancheff, Damien Le Moal

From: Shaun Tancheff <shaun.tancheff@seagate.com>

Define REQ_OP_ZONE_REPORT, REQ_OP_ZONE_RESET, REQ_OP_ZONE_OPEN,
REQ_OP_ZONE_CLOSE and REQ_OP_ZONE_FINISH for handling zones of
zoned block devices (host-managed and host-aware). With these
new commands, the total number of operations defined reaches 11,
which requires increasing REQ_OP_BITS from 3 to 4.
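
The arithmetic behind the REQ_OP_BITS change can be checked with a small
sketch. op_bits_needed() is a hypothetical helper, not kernel code. It
returns the smallest number of bits able to encode nr_ops distinct operation
codes numbered from 0.

```c
#include <assert.h>

/* Smallest bit count b such that 2^b >= nr_ops (capped at 32 bits). */
static inline unsigned int op_bits_needed(unsigned int nr_ops)
{
	unsigned int bits = 0;

	while (bits < 32 && (1u << bits) < nr_ops)
		bits++;
	return bits;
}
```

The 6 operations defined before this patch fit in 3 bits (8 codes), while
11 operations need 4 bits (16 codes).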

Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>

Changelog (Damien):
None of these requests has a payload. Each may operate on all zones
of the device (when the BIO sector and size are 0) or on a single zone
(when the BIO sector and size are aligned on a zone).

REQ_OP_ZONE_REPORT is not sent directly to the device
and is processed in sd_zbc.c using the device zone work
in order to parse the report reply and manage changes to
the zone information cache of the device.

Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
---
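The range convention shared by the REQ_OP_ZONE_* operations can be sketched
in user space as follows. The enum and function names are hypothetical, and
the sketch assumes a single zone size for the whole device, which this patch
does not guarantee.

```c
#include <assert.h>
#include <stdint.h>

enum zone_op_scope {
	ZONE_OP_INVALID,	/* range matches neither convention */
	ZONE_OP_ALL_ZONES,	/* sector == 0 and size == 0 */
	ZONE_OP_ONE_ZONE,	/* zone start sector and zone size in bytes */
};

/*
 * Classify a zone operation from its start sector and size in bytes.
 * zone_sectors is the assumed uniform zone size in 512-byte sectors.
 */
static inline enum zone_op_scope classify_zone_op(uint64_t sector,
						  uint64_t size_bytes,
						  uint64_t zone_sectors)
{
	if (sector == 0 && size_bytes == 0)
		return ZONE_OP_ALL_ZONES;
	if (zone_sectors && sector % zone_sectors == 0 &&
	    size_bytes == zone_sectors * 512)
		return ZONE_OP_ONE_ZONE;
	return ZONE_OP_INVALID;
}
```

Note that sector 0 is ambiguous only in combination with the size: a size of
0 selects all zones, while a size of one zone selects the first zone.
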
 block/blk-core.c          |  8 ++++++++
 block/blk-merge.c         | 31 +++++++++++++++++++++++++++----
 include/linux/bio.h       | 36 +++++++++++++++++++++++++++---------
 include/linux/blk_types.h | 27 ++++++++++++++++++++++++++-
 4 files changed, 88 insertions(+), 14 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 36c7ac3..4a7f7ba 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1941,6 +1941,14 @@ generic_make_request_checks(struct bio *bio)
 	case REQ_OP_WRITE_SAME:
 		if (!bdev_write_same(bio->bi_bdev))
 			goto not_supported;
+		break;
+	case REQ_OP_ZONE_REPORT:
+	case REQ_OP_ZONE_RESET:
+	case REQ_OP_ZONE_OPEN:
+	case REQ_OP_ZONE_CLOSE:
+	case REQ_OP_ZONE_FINISH:
+		if (!bdev_zoned(bio->bi_bdev))
+			goto not_supported;
 		break;
 	default:
 		break;
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 2642e5f..f9299df 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -202,6 +202,21 @@ void blk_queue_split(struct request_queue *q, struct bio **bio,
 	case REQ_OP_WRITE_SAME:
 		split = blk_bio_write_same_split(q, *bio, bs, &nsegs);
 		break;
+	case REQ_OP_ZONE_REPORT:
+	case REQ_OP_ZONE_RESET:
+	case REQ_OP_ZONE_OPEN:
+	case REQ_OP_ZONE_CLOSE:
+	case REQ_OP_ZONE_FINISH:
+		/*
+		 * For these commands, bi_size is either 0 to specify
+		 * operation on the entire block device sector range,
+		 * or a zone size for operation on a single zone.
+		 * Since a zone size may be much bigger than the maximum
+		 * allowed BIO size, we cannot use blk_bio_segment_split.
+		 */
+		split = NULL;
+		nsegs = 0;
+		break;
 	default:
 		split = blk_bio_segment_split(q, *bio, q->bio_split, &nsegs);
 		break;
@@ -241,11 +256,19 @@ static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
 	 * This should probably be returning 0, but blk_add_request_payload()
 	 * (Christoph!!!!)
 	 */
-	if (bio_op(bio) == REQ_OP_DISCARD || bio_op(bio) == REQ_OP_SECURE_ERASE)
-		return 1;
-
-	if (bio_op(bio) == REQ_OP_WRITE_SAME)
+	switch (bio_op(bio)) {
+	case REQ_OP_DISCARD:
+	case REQ_OP_SECURE_ERASE:
+	case REQ_OP_WRITE_SAME:
+	case REQ_OP_ZONE_REPORT:
+	case REQ_OP_ZONE_RESET:
+	case REQ_OP_ZONE_OPEN:
+	case REQ_OP_ZONE_CLOSE:
+	case REQ_OP_ZONE_FINISH:
 		return 1;
+	default:
+		break;
+	}
 
 	fbio = bio;
 	cluster = blk_queue_cluster(q);
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 23ddf4b..d9c2e21 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -69,20 +69,38 @@
  */
 static inline bool bio_has_data(struct bio *bio)
 {
-	if (bio &&
-	    bio->bi_iter.bi_size &&
-	    bio_op(bio) != REQ_OP_DISCARD &&
-	    bio_op(bio) != REQ_OP_SECURE_ERASE)
-		return true;
+	if (!bio || !bio->bi_iter.bi_size)
+		return false;
 
-	return false;
+	switch (bio_op(bio)) {
+	case REQ_OP_DISCARD:
+	case REQ_OP_SECURE_ERASE:
+	case REQ_OP_ZONE_REPORT:
+	case REQ_OP_ZONE_RESET:
+	case REQ_OP_ZONE_OPEN:
+	case REQ_OP_ZONE_CLOSE:
+	case REQ_OP_ZONE_FINISH:
+		return false;
+	default:
+		return true;
+	}
 }
 
 static inline bool bio_no_advance_iter(struct bio *bio)
 {
-	return bio_op(bio) == REQ_OP_DISCARD ||
-	       bio_op(bio) == REQ_OP_SECURE_ERASE ||
-	       bio_op(bio) == REQ_OP_WRITE_SAME;
+	switch (bio_op(bio)) {
+	case REQ_OP_DISCARD:
+	case REQ_OP_SECURE_ERASE:
+	case REQ_OP_WRITE_SAME:
+	case REQ_OP_ZONE_REPORT:
+	case REQ_OP_ZONE_RESET:
+	case REQ_OP_ZONE_OPEN:
+	case REQ_OP_ZONE_CLOSE:
+	case REQ_OP_ZONE_FINISH:
+		return true;
+	default:
+		return false;
+	}
 }
 
 static inline bool bio_is_rw(struct bio *bio)
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 436f43f..70df996 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -229,6 +229,26 @@ enum rq_flag_bits {
 #define REQ_HASHED		(1ULL << __REQ_HASHED)
 #define REQ_MQ_INFLIGHT		(1ULL << __REQ_MQ_INFLIGHT)
 
+/*
+ * Note on zone operations:
+ * All REQ_OP_ZONE_* commands do not have a payload and share a common
+ * interface for specifying operation range:
+ * (1) bio->bi_iter.bi_sector and bio->bi_iter.bi_size set to 0:
+ *     the command is to operate on ALL zones of the device.
+ * (2) bio->bi_iter.bi_sector is set to a zone start sector and
+ *     bio->bi_iter.bi_size is set to the zone size in bytes:
+ *     the command is to operate on only the specified zone.
+ * Operation:
+ * REQ_OP_ZONE_REPORT: Request information for all zones or for a single zone.
+ * REQ_OP_ZONE_RESET: Reset the write pointer of all zones or of a single zone.
+ * REQ_OP_ZONE_OPEN: Explicitly open the maximum allowed number of zones or
+ *                   a single zone. For the former case, the zones that will
+ *                   actually be open are chosen by the disk.
+ * REQ_OP_ZONE_CLOSE: Close all implicitly or explicitly open zones or
+ *                    a single zone.
+ * REQ_OP_ZONE_FINISH: Transition one or all open and closed zones to the full
+ *                     condition.
+ */
 enum req_op {
 	REQ_OP_READ,
 	REQ_OP_WRITE,
@@ -236,9 +256,14 @@ enum req_op {
 	REQ_OP_SECURE_ERASE,	/* request to securely erase sectors */
 	REQ_OP_WRITE_SAME,	/* write same block many times */
 	REQ_OP_FLUSH,		/* request for cache flush */
+	REQ_OP_ZONE_REPORT,	/* Get zone information */
+	REQ_OP_ZONE_RESET,	/* Reset a zone write pointer */
+	REQ_OP_ZONE_OPEN,	/* Explicitly open a zone */
+	REQ_OP_ZONE_CLOSE,	/* Close an open zone */
+	REQ_OP_ZONE_FINISH,	/* Finish a zone */
 };
 
-#define REQ_OP_BITS 3
+#define REQ_OP_BITS 4
 
 typedef unsigned int blk_qc_t;
 #define BLK_QC_T_NONE	-1U
-- 
2.7.4

Western Digital Corporation (and its subsidiaries) E-mail Confidentiality Notice & Disclaimer:

This e-mail and any files transmitted with it may contain confidential or legally privileged information of WDC and/or its affiliates, and are intended solely for the use of the individual or entity to which they are addressed. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited. If you have received this e-mail in error, please notify the sender immediately and delete the e-mail in its entirety from your system.


* [PATCH 4/9] block: Define zoned block device operations
@ 2016-09-19 21:27   ` Damien Le Moal
  0 siblings, 0 replies; 36+ messages in thread
From: Damien Le Moal @ 2016-09-19 21:27 UTC (permalink / raw)
  To: linux-scsi, linux-block
  Cc: martin.petersen, axboe, hare, shaun.tancheff, Damien Le Moal

From: Shaun Tancheff <shaun.tancheff@seagate.com>

Define REQ_OP_ZONE_REPORT, REQ_OP_ZONE_RESET, REQ_OP_ZONE_OPEN,
REQ_OP_ZONE_CLOSE and REQ_OP_ZONE_FINISH for handling zones of
zoned block devices (host-managed and host-aware). With these
new commands, the total number of operations defined reaches 11,
which requires increasing REQ_OP_BITS from 3 to 4.
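
The bit-width arithmetic above can be checked with a small userspace
sketch. The EX_* names and ex_op_bits_needed() are hypothetical and only
mirror the operation list of this patch; they are not kernel symbols.

```c
#include <assert.h>

/* Illustrative mirror of enum req_op after this patch (hypothetical names). */
enum ex_req_op {
	EX_OP_READ,
	EX_OP_WRITE,
	EX_OP_DISCARD,
	EX_OP_SECURE_ERASE,
	EX_OP_WRITE_SAME,
	EX_OP_FLUSH,
	EX_OP_ZONE_REPORT,
	EX_OP_ZONE_RESET,
	EX_OP_ZONE_OPEN,
	EX_OP_ZONE_CLOSE,
	EX_OP_ZONE_FINISH,
	EX_OP_COUNT		/* number of operations, not an operation */
};

/* Smallest number of bits able to encode the values 0..nr_ops-1. */
static unsigned int ex_op_bits_needed(unsigned int nr_ops)
{
	unsigned int bits = 0;

	assert(nr_ops > 0);
	while ((1u << bits) < nr_ops)
		bits++;
	return bits;
}
```

With 3 bits only 8 operation codes can be encoded, so the 6 existing
operations fit but the 11 operations defined here do not.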

Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>

Changelog (Damien):
None of these requests has a payload. Each may operate on all zones
of the device (when the BIO sector and size are 0) or on a single
zone (when the BIO sector and size are aligned to a zone).

REQ_OP_ZONE_REPORT is not sent directly to the device
and is processed in sd_zbc.c using the device zone work
in order to parse the report reply and manage changes to
the zone information cache of the device.
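
As a rough illustration of the range convention described in this
changelog, the userspace sketch below classifies a (sector, size) pair.
It assumes a fixed zone size given in 512 B sectors, which real devices
do not guarantee (the last zone may be smaller); all ex_* names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum ex_zone_range {
	EX_RANGE_ALL_ZONES,	/* sector == 0 and size == 0 */
	EX_RANGE_ONE_ZONE,	/* zone start sector, zone size in bytes */
	EX_RANGE_INVALID,	/* anything else */
};

/*
 * Classify the range of a zone operation. @sector is in 512 B sectors,
 * @size_bytes plays the role of bi_size, and @zone_sectors is the
 * (assumed constant) zone size in sectors.
 */
static enum ex_zone_range ex_classify_zone_range(uint64_t sector,
						 uint32_t size_bytes,
						 uint64_t zone_sectors)
{
	assert(zone_sectors > 0);

	if (sector == 0 && size_bytes == 0)
		return EX_RANGE_ALL_ZONES;

	if (sector % zone_sectors == 0 &&
	    (uint64_t)size_bytes == zone_sectors << 9)
		return EX_RANGE_ONE_ZONE;

	return EX_RANGE_INVALID;
}
```

Note that sector 0 is ambiguous on its own: only the size tells an
all-zones request (size 0) apart from a request on the first zone.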

Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
---
 block/blk-core.c          |  8 ++++++++
 block/blk-merge.c         | 31 +++++++++++++++++++++++++++----
 include/linux/bio.h       | 36 +++++++++++++++++++++++++++---------
 include/linux/blk_types.h | 27 ++++++++++++++++++++++++++-
 4 files changed, 88 insertions(+), 14 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 36c7ac3..4a7f7ba 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1941,6 +1941,14 @@ generic_make_request_checks(struct bio *bio)
 	case REQ_OP_WRITE_SAME:
 		if (!bdev_write_same(bio->bi_bdev))
 			goto not_supported;
+		break;
+	case REQ_OP_ZONE_REPORT:
+	case REQ_OP_ZONE_RESET:
+	case REQ_OP_ZONE_OPEN:
+	case REQ_OP_ZONE_CLOSE:
+	case REQ_OP_ZONE_FINISH:
+		if (!bdev_zoned(bio->bi_bdev))
+			goto not_supported;
 		break;
 	default:
 		break;
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 2642e5f..f9299df 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -202,6 +202,21 @@ void blk_queue_split(struct request_queue *q, struct bio **bio,
 	case REQ_OP_WRITE_SAME:
 		split = blk_bio_write_same_split(q, *bio, bs, &nsegs);
 		break;
+	case REQ_OP_ZONE_REPORT:
+	case REQ_OP_ZONE_RESET:
+	case REQ_OP_ZONE_OPEN:
+	case REQ_OP_ZONE_CLOSE:
+	case REQ_OP_ZONE_FINISH:
+		/*
+		 * For these commands, bi_size is either 0 to specify
+		 * operation on the entire block device sector range,
+		 * or a zone size for operation on a single zone.
+		 * Since a zone size may be much bigger than the maximum
+		 * allowed BIO size, we cannot use blk_bio_segment_split.
+		 */
+		split = NULL;
+		nsegs = 0;
+		break;
 	default:
 		split = blk_bio_segment_split(q, *bio, q->bio_split, &nsegs);
 		break;
@@ -241,11 +256,19 @@ static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
 	 * This should probably be returning 0, but blk_add_request_payload()
 	 * (Christoph!!!!)
 	 */
-	if (bio_op(bio) == REQ_OP_DISCARD || bio_op(bio) == REQ_OP_SECURE_ERASE)
-		return 1;
-
-	if (bio_op(bio) == REQ_OP_WRITE_SAME)
+	switch (bio_op(bio)) {
+	case REQ_OP_DISCARD:
+	case REQ_OP_SECURE_ERASE:
+	case REQ_OP_WRITE_SAME:
+	case REQ_OP_ZONE_REPORT:
+	case REQ_OP_ZONE_RESET:
+	case REQ_OP_ZONE_OPEN:
+	case REQ_OP_ZONE_CLOSE:
+	case REQ_OP_ZONE_FINISH:
 		return 1;
+	default:
+		break;
+	}
 
 	fbio = bio;
 	cluster = blk_queue_cluster(q);
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 23ddf4b..d9c2e21 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -69,20 +69,38 @@
  */
 static inline bool bio_has_data(struct bio *bio)
 {
-	if (bio &&
-	    bio->bi_iter.bi_size &&
-	    bio_op(bio) != REQ_OP_DISCARD &&
-	    bio_op(bio) != REQ_OP_SECURE_ERASE)
-		return true;
+	if (!bio || !bio->bi_iter.bi_size)
+		return false;
 
-	return false;
+	switch (bio_op(bio)) {
+	case REQ_OP_DISCARD:
+	case REQ_OP_SECURE_ERASE:
+	case REQ_OP_ZONE_REPORT:
+	case REQ_OP_ZONE_RESET:
+	case REQ_OP_ZONE_OPEN:
+	case REQ_OP_ZONE_CLOSE:
+	case REQ_OP_ZONE_FINISH:
+		return false;
+	default:
+		return true;
+	}
 }
 
 static inline bool bio_no_advance_iter(struct bio *bio)
 {
-	return bio_op(bio) == REQ_OP_DISCARD ||
-	       bio_op(bio) == REQ_OP_SECURE_ERASE ||
-	       bio_op(bio) == REQ_OP_WRITE_SAME;
+	switch (bio_op(bio)) {
+	case REQ_OP_DISCARD:
+	case REQ_OP_SECURE_ERASE:
+	case REQ_OP_WRITE_SAME:
+	case REQ_OP_ZONE_REPORT:
+	case REQ_OP_ZONE_RESET:
+	case REQ_OP_ZONE_OPEN:
+	case REQ_OP_ZONE_CLOSE:
+	case REQ_OP_ZONE_FINISH:
+		return true;
+	default:
+		return false;
+	}
 }
 
 static inline bool bio_is_rw(struct bio *bio)
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 436f43f..70df996 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -229,6 +229,26 @@ enum rq_flag_bits {
 #define REQ_HASHED		(1ULL << __REQ_HASHED)
 #define REQ_MQ_INFLIGHT		(1ULL << __REQ_MQ_INFLIGHT)
 
+/*
+ * Note on zone operations:
+ * REQ_OP_ZONE_* commands have no payload and share a common
+ * interface for specifying operation range:
+ * (1) bio->bi_iter.bi_sector and bio->bi_iter.bi_size set to 0:
+ *     the command is to operate on ALL zones of the device.
+ * (2) bio->bi_iter.bi_sector is set to a zone start sector and
+ *     bio->bi_iter.bi_size is set to the zone size in bytes:
+ *     the command is to operate on only the specified zone.
+ * Operation:
+ * REQ_OP_ZONE_REPORT: Request information for all zones or for a single zone.
+ * REQ_OP_ZONE_RESET: Reset the write pointer of all zones or of a single zone.
+ * REQ_OP_ZONE_OPEN: Explicitly open the maximum allowed number of zones or
+ *                   a single zone. For the former case, the zones that will
+ *                   actually be open are chosen by the disk.
+ * REQ_OP_ZONE_CLOSE: Close all implicitly or explicitly open zones or
+ *                    a single zone.
+ * REQ_OP_ZONE_FINISH: Transition one or all open and closed zones to the full
+ *                     condition.
+ */
 enum req_op {
 	REQ_OP_READ,
 	REQ_OP_WRITE,
@@ -236,9 +256,14 @@ enum req_op {
 	REQ_OP_SECURE_ERASE,	/* request to securely erase sectors */
 	REQ_OP_WRITE_SAME,	/* write same block many times */
 	REQ_OP_FLUSH,		/* request for cache flush */
+	REQ_OP_ZONE_REPORT,	/* Get zone information */
+	REQ_OP_ZONE_RESET,	/* Reset a zone write pointer */
+	REQ_OP_ZONE_OPEN,	/* Explicitly open a zone */
+	REQ_OP_ZONE_CLOSE,	/* Close an open zone */
+	REQ_OP_ZONE_FINISH,	/* Finish a zone */
 };
 
-#define REQ_OP_BITS 3
+#define REQ_OP_BITS 4
 
 typedef unsigned int blk_qc_t;
 #define BLK_QC_T_NONE	-1U
-- 
2.7.4



* [PATCH 5/9] block: Implement support for zoned block devices
  2016-09-19 21:27 ` Damien Le Moal
@ 2016-09-19 21:27   ` Damien Le Moal
  -1 siblings, 0 replies; 36+ messages in thread
From: Damien Le Moal @ 2016-09-19 21:27 UTC (permalink / raw)
  To: linux-scsi, linux-block
  Cc: martin.petersen, axboe, hare, shaun.tancheff, Damien Le Moal

From: Hannes Reinecke <hare@suse.de>

Implement an RB-tree holding the zone information of a zoned block
device (struct blk_zone) and add support functions for maintaining
the RB-tree and manipulating zone structs. The block layer support
does not differentiate between host-aware and host-managed devices.
The different constraints of these zone models are handled by the
generic SCSI layer sd driver down the stack.
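
The lookup rule of the zone tree (go left when the sector is below the
zone start, go right when it is at or past the zone end, otherwise the
zone is found) can be sketched in userspace. To stay self-contained,
this sketch swaps the RB-tree for a binary search over a sorted array of
non-overlapping zones; the ex_* names and the sample table are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct ex_zone {
	uint64_t start;	/* first sector of the zone */
	uint64_t len;	/* zone length in sectors */
};

/* Sample table, sorted by start sector, with a gap at sectors 200..299. */
static const struct ex_zone ex_sample_zones[3] = {
	{ 0, 100 }, { 100, 100 }, { 300, 50 },
};

/*
 * Return the zone containing @sector, or NULL if no zone covers it.
 * The three-way comparison is the same as in a tree descent; only the
 * container (sorted array instead of RB-tree) differs.
 */
static const struct ex_zone *ex_lookup_zone(const struct ex_zone *zones,
					    size_t nr_zones, uint64_t sector)
{
	size_t lo = 0, hi = nr_zones;

	assert(zones != NULL || nr_zones == 0);

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		const struct ex_zone *z = &zones[mid];

		if (sector < z->start)
			hi = mid;		/* "left" subtree */
		else if (sector >= z->start + z->len)
			lo = mid + 1;		/* "right" subtree */
		else
			return z;		/* sector is inside z */
	}

	return NULL;
}
```

Because zones never overlap, any sector inside a zone identifies that
zone, which is why callers can pass an arbitrary sector rather than the
zone start.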

Signed-off-by: Hannes Reinecke <hare@suse.de>

Changelog (Damien):
* Changed struct blk_zone to be more compact (64B)
* Changed zone locking to use bit_spin_lock in place of a regular
  spinlock
* Zone operations are requested from the underlying block device driver
  through BIOs carrying the REQ_OP_ZONE_* operation codes.
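
The idea behind locking a zone through a bit of its flags word, rather
than through a separate spinlock field, can be sketched in userspace
with C11 atomics. This is only an approximation under stated
assumptions: the kernel bit_spin_lock() also disables preemption, which
has no equivalent here, and all ex_* names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum { EX_ZONE_LOCKED = 0 };	/* bit number inside the flags word */

struct ex_zone_state {
	atomic_ulong flags;	/* lock bit shares the word with other flags */
};

/* Zero-initialized sample zone: lock bit clear. */
static struct ex_zone_state ex_sample_zone;

/* Try to set the lock bit; true if this caller took the lock. */
static bool ex_trylock_zone(struct ex_zone_state *z)
{
	unsigned long mask = 1UL << EX_ZONE_LOCKED;
	unsigned long old;

	old = atomic_fetch_or_explicit(&z->flags, mask, memory_order_acquire);
	return !(old & mask);
}

/* Spin until the lock bit is acquired. */
static void ex_lock_zone(struct ex_zone_state *z)
{
	while (!ex_trylock_zone(z))
		;	/* busy-wait */
}

/* Clear the lock bit; the caller must hold the lock. */
static void ex_unlock_zone(struct ex_zone_state *z)
{
	unsigned long mask = 1UL << EX_ZONE_LOCKED;
	unsigned long old;

	old = atomic_fetch_and_explicit(&z->flags, ~mask, memory_order_release);
	assert(old & mask);
}
```

Keeping the lock in the flags word avoids a dedicated lock field in each
zone descriptor, which is consistent with the compaction goal stated in
the changelog.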

Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
---
 block/Kconfig          |   8 ++
 block/Makefile         |   1 +
 block/blk-core.c       |   4 +
 block/blk-zoned.c      | 338 +++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/blkdev.h | 113 +++++++++++++++++
 5 files changed, 464 insertions(+)
 create mode 100644 block/blk-zoned.c

diff --git a/block/Kconfig b/block/Kconfig
index 161491d..c3a18f0 100644
--- a/block/Kconfig
+++ b/block/Kconfig
@@ -88,6 +88,14 @@ config BLK_DEV_INTEGRITY
 	T10/SCSI Data Integrity Field or the T13/ATA External Path
 	Protection.  If in doubt, say N.
 
+config BLK_DEV_ZONED
+	bool "Zoned block device support"
+	---help---
+	Block layer zoned block device support. This option enables
+	support for ZAC/ZBC host-managed and host-aware zoned block devices.
+
+	Say yes here if you have a ZAC or ZBC storage device.
+
 config BLK_DEV_THROTTLING
 	bool "Block layer bio throttling support"
 	depends on BLK_CGROUP=y
diff --git a/block/Makefile b/block/Makefile
index 9eda232..aee67fa 100644
--- a/block/Makefile
+++ b/block/Makefile
@@ -22,4 +22,5 @@ obj-$(CONFIG_IOSCHED_CFQ)	+= cfq-iosched.o
 obj-$(CONFIG_BLOCK_COMPAT)	+= compat_ioctl.o
 obj-$(CONFIG_BLK_CMDLINE_PARSER)	+= cmdline-parser.o
 obj-$(CONFIG_BLK_DEV_INTEGRITY) += bio-integrity.o blk-integrity.o t10-pi.o
+obj-$(CONFIG_BLK_DEV_ZONED)	+= blk-zoned.o
 
diff --git a/block/blk-core.c b/block/blk-core.c
index 4a7f7ba..2c5d069d 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -590,6 +590,8 @@ void blk_cleanup_queue(struct request_queue *q)
 		blk_mq_free_queue(q);
 	percpu_ref_exit(&q->q_usage_counter);
 
+	blk_drop_zones(q);
+
 	spin_lock_irq(lock);
 	if (q->queue_lock != &q->__queue_lock)
 		q->queue_lock = &q->__queue_lock;
@@ -728,6 +730,8 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id)
 #endif
 	INIT_DELAYED_WORK(&q->delay_work, blk_delay_work);
 
+	blk_init_zones(q);
+
 	kobject_init(&q->kobj, &blk_queue_ktype);
 
 	mutex_init(&q->sysfs_lock);
diff --git a/block/blk-zoned.c b/block/blk-zoned.c
new file mode 100644
index 0000000..a107940
--- /dev/null
+++ b/block/blk-zoned.c
@@ -0,0 +1,338 @@
+/*
+ * Zoned block device handling
+ *
+ * Copyright (c) 2015, Hannes Reinecke
+ * Copyright (c) 2015, SUSE Linux GmbH
+ *
+ * Copyright (c) 2016, Damien Le Moal
+ * Copyright (c) 2016, Western Digital
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/rbtree.h>
+#include <linux/blkdev.h>
+
+void blk_init_zones(struct request_queue *q)
+{
+	spin_lock_init(&q->zones_lock);
+	q->zones = RB_ROOT;
+}
+
+/**
+ * blk_drop_zones - Empty a zoned device zone tree.
+ * @q: queue of the zoned device to operate on
+ *
+ * Free all zone descriptors added to the queue zone tree.
+ */
+void blk_drop_zones(struct request_queue *q)
+{
+	struct rb_root *root = &q->zones;
+	struct blk_zone *zone, *next;
+
+	rbtree_postorder_for_each_entry_safe(zone, next, root, node)
+		kfree(zone);
+	q->zones = RB_ROOT;
+}
+EXPORT_SYMBOL_GPL(blk_drop_zones);
+
+/**
+ * blk_insert_zone - Add a new zone struct to the queue RB-tree.
+ * @q: queue of the zoned device to operate on
+ * @new_zone: The zone struct to add
+ *
+ * If @new_zone is not already added to the zone tree, add it.
+ * Otherwise, return the existing entry.
+ */
+struct blk_zone *blk_insert_zone(struct request_queue *q,
+				 struct blk_zone *new_zone)
+{
+	struct rb_root *root = &q->zones;
+	struct rb_node **new = &(root->rb_node), *parent = NULL;
+	struct blk_zone *zone = NULL;
+	unsigned long flags;
+
+	spin_lock_irqsave(&q->zones_lock, flags);
+
+	/* Figure out where to put new node */
+	while (*new) {
+		zone = container_of(*new, struct blk_zone, node);
+		parent = *new;
+		if (new_zone->start + new_zone->len <= zone->start)
+			new = &((*new)->rb_left);
+		else if (new_zone->start >= zone->start + zone->len)
+			new = &((*new)->rb_right);
+		else
+			/* Return existing zone */
+			break;
+		zone = NULL;
+	}
+
+	if (!zone) {
+		/* No existing zone: add new node and rebalance tree */
+		rb_link_node(&new_zone->node, parent, new);
+		rb_insert_color(&new_zone->node, root);
+	}
+
+	spin_unlock_irqrestore(&q->zones_lock, flags);
+
+	return zone;
+}
+EXPORT_SYMBOL_GPL(blk_insert_zone);
+
+/**
+ * blk_lookup_zone - Search a zone in a zoned device zone tree.
+ * @q: queue of the zoned device tree to search
+ * @sector: A sector within the zone to search for
+ *
+ * Search the zone containing @sector in the zone tree owned
+ * by @q. NULL is returned if the zone is not found. Since this
+ * can be called concurrently with blk_insert_zone during device
+ * initialization, the tree traversal is protected using the
+ * zones_lock of the queue.
+ */
+struct blk_zone *blk_lookup_zone(struct request_queue *q, sector_t sector)
+{
+	struct rb_root *root = &q->zones;
+	struct rb_node *node = root->rb_node;
+	struct blk_zone *zone = NULL;
+	unsigned long flags;
+
+	spin_lock_irqsave(&q->zones_lock, flags);
+
+	while (node) {
+		zone = container_of(node, struct blk_zone, node);
+		if (sector < zone->start)
+			node = node->rb_left;
+		else if (sector >= zone->start + zone->len)
+			node = node->rb_right;
+		else
+			break;
+		zone = NULL;
+	}
+
+	spin_unlock_irqrestore(&q->zones_lock, flags);
+
+	return zone;
+}
+EXPORT_SYMBOL_GPL(blk_lookup_zone);
+
+/**
+ * Execute a zone operation (REQ_OP_ZONE*)
+ */
+static int blkdev_issue_zone_operation(struct block_device *bdev,
+				       unsigned int op,
+				       sector_t sector, sector_t nr_sects,
+				       gfp_t gfp_mask)
+{
+	struct bio *bio;
+	int ret;
+
+	if (!bdev_zoned(bdev))
+		return -EOPNOTSUPP;
+
+	/*
+	 * Make sure bi_size does not overflow because
+	 * of some weird very large zone size.
+	 */
+	if (nr_sects && (unsigned long long)nr_sects << 9 > UINT_MAX)
+		return -EINVAL;
+
+	bio = bio_alloc(gfp_mask, 1);
+	if (!bio)
+		return -ENOMEM;
+
+	bio->bi_iter.bi_sector = sector;
+	bio->bi_iter.bi_size = nr_sects << 9;
+	bio->bi_vcnt = 0;
+	bio->bi_bdev = bdev;
+	bio_set_op_attrs(bio, op, 0);
+
+	ret = submit_bio_wait(bio);
+
+	bio_put(bio);
+
+	return ret;
+}
+
+/**
+ * blkdev_update_zones - Force an update of a device's zone information
+ * @bdev:	Target block device
+ *
+ * Force an update of all zone information of @bdev. This call does not
+ * block waiting for the update to complete. On return, all zones are only
+ * marked as "in-update". Waiting on the zone update to complete can be done
+ * on a per zone basis using the function blk_wait_for_zone_update.
+ */
+int blkdev_update_zones(struct block_device *bdev,
+			gfp_t gfp_mask)
+{
+	return blkdev_issue_zone_operation(bdev, REQ_OP_ZONE_REPORT,
+					   0, 0, gfp_mask);
+}
+
+/*
+ * Wait for a zone update to complete.
+ */
+static void __blk_wait_for_zone_update(struct blk_zone *zone)
+{
+	might_sleep();
+	if (test_bit(BLK_ZONE_IN_UPDATE, &zone->flags))
+		wait_on_bit_io(&zone->flags, BLK_ZONE_IN_UPDATE,
+			       TASK_UNINTERRUPTIBLE);
+}
+
+/**
+ * blk_wait_for_zone_update - Wait for a zone information update
+ * @zone: The zone to wait for
+ *
+ * This must be called with the zone lock held. If @zone is not
+ * under update, returns immediately. Otherwise, wait for the
+ * update flag to be cleared on completion of the zone information
+ * update by the device driver.
+ */
+void blk_wait_for_zone_update(struct blk_zone *zone)
+{
+	WARN_ON_ONCE(!test_bit(BLK_ZONE_LOCKED, &zone->flags));
+	while (test_bit(BLK_ZONE_IN_UPDATE, &zone->flags)) {
+		blk_unlock_zone(zone);
+		__blk_wait_for_zone_update(zone);
+		blk_lock_zone(zone);
+	}
+}
+
+/**
+ * blkdev_report_zone - Get zone information
+ * @bdev:	Target block device
+ * @sector:	A sector of the zone to report
+ * @update:	Force an update of the zone information
+ * @gfp_mask:	Memory allocation flags (for bio_alloc)
+ *
+ * Get a zone from the zone cache and return it.
+ * If update is requested, issue a report zone operation
+ * and wait for the zone information to be updated.
+ */
+struct blk_zone *blkdev_report_zone(struct block_device *bdev,
+				    sector_t sector,
+				    bool update,
+				    gfp_t gfp_mask)
+{
+	struct request_queue *q = bdev_get_queue(bdev);
+	struct blk_zone *zone;
+	int ret;
+
+	zone = blk_lookup_zone(q, sector);
+	if (!zone)
+		return ERR_PTR(-ENXIO);
+
+	if (update) {
+		ret = blkdev_issue_zone_operation(bdev, REQ_OP_ZONE_REPORT,
+						  zone->start, zone->len,
+						  gfp_mask);
+		if (ret)
+			return ERR_PTR(ret);
+		__blk_wait_for_zone_update(zone);
+	}
+
+	return zone;
+}
+
+/**
+ * Execute a zone action (open, close, reset or finish).
+ */
+static int blkdev_issue_zone_action(struct block_device *bdev,
+				    sector_t sector, unsigned int op,
+				    gfp_t gfp_mask)
+{
+	struct request_queue *q = bdev_get_queue(bdev);
+	struct blk_zone *zone;
+	sector_t nr_sects;
+	int ret;
+
+	if (!blk_queue_zoned(q))
+		return -EOPNOTSUPP;
+
+	if (sector == ~0ULL) {
+		/* All zones */
+		sector = 0;
+		nr_sects = 0;
+	} else {
+		/* This zone */
+		zone = blk_lookup_zone(q, sector);
+		if (!zone)
+			return -ENXIO;
+		sector = zone->start;
+		nr_sects = zone->len;
+	}
+
+	ret = blkdev_issue_zone_operation(bdev, op, sector,
+					  nr_sects, gfp_mask);
+	if (ret == 0 && !nr_sects)
+		blkdev_update_zones(bdev, gfp_mask);
+
+	return ret;
+}
+
+/**
+ * blkdev_reset_zone - Reset a zone write pointer
+ * @bdev:	target block device
+ * @sector:	A sector of the zone to reset or ~0ul for all zones.
+ * @gfp_mask:	memory allocation flags (for bio_alloc)
+ *
+ * Description:
+ *    Reset a zone or all zones write pointer.
+ */
+int blkdev_reset_zone(struct block_device *bdev,
+		      sector_t sector, gfp_t gfp_mask)
+{
+	return blkdev_issue_zone_action(bdev, sector, REQ_OP_ZONE_RESET,
+					gfp_mask);
+}
+
+/**
+ * blkdev_open_zone - Explicitly open a zone
+ * @bdev:	target block device
+ * @sector:	A sector of the zone to open or ~0ul for all zones.
+ * @gfp_mask:	memory allocation flags (for bio_alloc)
+ *
+ * Description:
+ *    Open a zone or all possible zones.
+ */
+int blkdev_open_zone(struct block_device *bdev,
+		     sector_t sector, gfp_t gfp_mask)
+{
+	return blkdev_issue_zone_action(bdev, sector, REQ_OP_ZONE_OPEN,
+					gfp_mask);
+}
+
+/**
+ * blkdev_close_zone - Close an open zone
+ * @bdev:	target block device
+ * @sector:	A sector of the zone to close or ~0ul for all zones.
+ * @gfp_mask:	memory allocation flags (for bio_alloc)
+ *
+ * Description:
+ *    Close a zone or all open zones.
+ */
+int blkdev_close_zone(struct block_device *bdev,
+		      sector_t sector, gfp_t gfp_mask)
+{
+	return blkdev_issue_zone_action(bdev, sector, REQ_OP_ZONE_CLOSE,
+					gfp_mask);
+}
+
+/**
+ * blkdev_finish_zone - Finish a zone (make it full)
+ * @bdev:	target block device
+ * @sector:	A sector of the zone to finish or ~0ul for all zones.
+ * @gfp_mask:	memory allocation flags (for bio_alloc)
+ *
+ * Description:
+ *    Finish one zone or all possible zones.
+ */
+int blkdev_finish_zone(struct block_device *bdev,
+		       sector_t sector, gfp_t gfp_mask)
+{
+	return blkdev_issue_zone_action(bdev, sector, REQ_OP_ZONE_FINISH,
+					gfp_mask);
+}
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 1c74b19..1165594 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -24,6 +24,7 @@
 #include <linux/rcupdate.h>
 #include <linux/percpu-refcount.h>
 #include <linux/scatterlist.h>
+#include <linux/bit_spinlock.h>
 
 struct module;
 struct scsi_ioctl_command;
@@ -302,6 +303,113 @@ struct queue_limits {
 	unsigned char		zoned;
 };
 
+#ifdef CONFIG_BLK_DEV_ZONED
+
+enum blk_zone_type {
+	BLK_ZONE_TYPE_UNKNOWN,
+	BLK_ZONE_TYPE_CONVENTIONAL,
+	BLK_ZONE_TYPE_SEQWRITE_REQ,
+	BLK_ZONE_TYPE_SEQWRITE_PREF,
+};
+
+enum blk_zone_cond {
+	BLK_ZONE_COND_NO_WP,
+	BLK_ZONE_COND_EMPTY,
+	BLK_ZONE_COND_IMP_OPEN,
+	BLK_ZONE_COND_EXP_OPEN,
+	BLK_ZONE_COND_CLOSED,
+	BLK_ZONE_COND_READONLY = 0xd,
+	BLK_ZONE_COND_FULL,
+	BLK_ZONE_COND_OFFLINE,
+};
+
+enum blk_zone_flags {
+	BLK_ZONE_LOCKED,
+	BLK_ZONE_WRITE_LOCKED,
+	BLK_ZONE_IN_UPDATE,
+};
+
+/**
+ * Zone descriptor. On 64-bit architectures,
+ * this will align on sizeof(long), i.e. 8 B,
+ * and use 64 B.
+ */
+struct blk_zone {
+	struct rb_node	node;
+	unsigned long 	flags;
+	sector_t	len;
+	sector_t 	start;
+	sector_t 	wp;
+	unsigned int 	type : 4;
+	unsigned int	cond : 4;
+	unsigned int	non_seq : 1;
+	unsigned int	reset : 1;
+};
+
+#define blk_zone_is_seq_req(z)	((z)->type == BLK_ZONE_TYPE_SEQWRITE_REQ)
+#define blk_zone_is_seq_pref(z)	((z)->type == BLK_ZONE_TYPE_SEQWRITE_PREF)
+#define blk_zone_is_seq(z)	(blk_zone_is_seq_req(z) || blk_zone_is_seq_pref(z))
+#define blk_zone_is_conv(z) 	((z)->type == BLK_ZONE_TYPE_CONVENTIONAL)
+
+#define blk_zone_is_readonly(z)	((z)->cond == BLK_ZONE_COND_READONLY)
+#define blk_zone_is_offline(z) 	((z)->cond == BLK_ZONE_COND_OFFLINE)
+#define blk_zone_is_full(z)	((z)->cond == BLK_ZONE_COND_FULL)
+#define blk_zone_is_empty(z)	((z)->cond == BLK_ZONE_COND_EMPTY)
+#define blk_zone_is_open(z)	((z)->cond == BLK_ZONE_COND_EXP_OPEN)
+
+static inline void blk_lock_zone(struct blk_zone *zone)
+{
+	bit_spin_lock(BLK_ZONE_LOCKED, &zone->flags);
+}
+
+static inline int blk_trylock_zone(struct blk_zone *zone)
+{
+	return bit_spin_trylock(BLK_ZONE_LOCKED, &zone->flags);
+}
+
+static inline void blk_unlock_zone(struct blk_zone *zone)
+{
+	bit_spin_unlock(BLK_ZONE_LOCKED, &zone->flags);
+}
+
+static inline int blk_try_write_lock_zone(struct blk_zone *zone)
+{
+	return !test_and_set_bit(BLK_ZONE_WRITE_LOCKED, &zone->flags);
+}
+
+static inline void blk_write_unlock_zone(struct blk_zone *zone)
+{
+	clear_bit_unlock(BLK_ZONE_WRITE_LOCKED, &zone->flags);
+	smp_mb__after_atomic();
+}
+
+extern void blk_init_zones(struct request_queue *);
+extern void blk_drop_zones(struct request_queue *);
+extern struct blk_zone *blk_insert_zone(struct request_queue *,
+					struct blk_zone *);
+extern struct blk_zone *blk_lookup_zone(struct request_queue *, sector_t);
+
+extern int blkdev_update_zones(struct block_device *, gfp_t);
+extern void blk_wait_for_zone_update(struct blk_zone *);
+#define blk_zone_in_update(z)	test_bit(BLK_ZONE_IN_UPDATE, &(z)->flags)
+static inline void blk_clear_zone_update(struct blk_zone *zone)
+{
+	clear_bit_unlock(BLK_ZONE_IN_UPDATE, &zone->flags);
+	smp_mb__after_atomic();
+	wake_up_bit(&zone->flags, BLK_ZONE_IN_UPDATE);
+}
+
+extern struct blk_zone *blkdev_report_zone(struct block_device *,
+					   sector_t, bool, gfp_t);
+extern int blkdev_reset_zone(struct block_device *, sector_t, gfp_t);
+extern int blkdev_open_zone(struct block_device *, sector_t, gfp_t);
+extern int blkdev_close_zone(struct block_device *, sector_t, gfp_t);
+extern int blkdev_finish_zone(struct block_device *, sector_t, gfp_t);
+#else /* CONFIG_BLK_DEV_ZONED */
+static inline void blk_init_zones(struct request_queue *q) { }
+static inline void blk_drop_zones(struct request_queue *q) { }
+#endif /* CONFIG_BLK_DEV_ZONED */
+
 struct request_queue {
 	/*
 	 * Together with queue_head for cacheline sharing
@@ -404,6 +512,11 @@ struct request_queue {
 	unsigned int		nr_pending;
 #endif
 
+#ifdef CONFIG_BLK_DEV_ZONED
+	spinlock_t		zones_lock;
+	struct rb_root		zones;
+#endif
+
 	/*
 	 * queue settings
 	 */
-- 
2.7.4



* [PATCH 5/9] block: Implement support for zoned block devices
@ 2016-09-19 21:27   ` Damien Le Moal
  0 siblings, 0 replies; 36+ messages in thread
From: Damien Le Moal @ 2016-09-19 21:27 UTC (permalink / raw)
  To: linux-scsi, linux-block
  Cc: martin.petersen, axboe, hare, shaun.tancheff, Damien Le Moal

From: Hannes Reinecke <hare@suse.de>

Implement an RB-tree holding the zone information of a zoned block
device (struct blk_zone) and add support functions for maintaining
the RB-tree and manipulating zone structs. The block layer support
does not differentiate between host-aware and host-managed devices.
The different constraints of these zone models are handled by the
generic SCSI layer sd driver down the stack.

Signed-off-by: Hannes Reinecke <hare@suse.de>

Changelog (Damien):
* Changed struct blk_zone to be more compact (64B)
* Changed zone locking to use bit_spin_lock in place of a regular
  spinlock
* Zone operations are requested from the underlying block device driver
  through BIOs carrying the REQ_OP_ZONE_* operation codes.

Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
---
 block/Kconfig          |   8 ++
 block/Makefile         |   1 +
 block/blk-core.c       |   4 +
 block/blk-zoned.c      | 338 +++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/blkdev.h | 113 +++++++++++++++++
 5 files changed, 464 insertions(+)
 create mode 100644 block/blk-zoned.c

diff --git a/block/Kconfig b/block/Kconfig
index 161491d..c3a18f0 100644
--- a/block/Kconfig
+++ b/block/Kconfig
@@ -88,6 +88,14 @@ config BLK_DEV_INTEGRITY
 	T10/SCSI Data Integrity Field or the T13/ATA External Path
 	Protection.  If in doubt, say N.
 
+config BLK_DEV_ZONED
+	bool "Zoned block device support"
+	---help---
+	Block layer zoned block device support. This option enables
+	support for ZAC/ZBC host-managed and host-aware zoned block devices.
+
+	Say yes here if you have a ZAC or ZBC storage device.
+
 config BLK_DEV_THROTTLING
 	bool "Block layer bio throttling support"
 	depends on BLK_CGROUP=y
diff --git a/block/Makefile b/block/Makefile
index 9eda232..aee67fa 100644
--- a/block/Makefile
+++ b/block/Makefile
@@ -22,4 +22,5 @@ obj-$(CONFIG_IOSCHED_CFQ)	+= cfq-iosched.o
 obj-$(CONFIG_BLOCK_COMPAT)	+= compat_ioctl.o
 obj-$(CONFIG_BLK_CMDLINE_PARSER)	+= cmdline-parser.o
 obj-$(CONFIG_BLK_DEV_INTEGRITY) += bio-integrity.o blk-integrity.o t10-pi.o
+obj-$(CONFIG_BLK_DEV_ZONED)	+= blk-zoned.o
 
diff --git a/block/blk-core.c b/block/blk-core.c
index 4a7f7ba..2c5d069d 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -590,6 +590,8 @@ void blk_cleanup_queue(struct request_queue *q)
 		blk_mq_free_queue(q);
 	percpu_ref_exit(&q->q_usage_counter);
 
+	blk_drop_zones(q);
+
 	spin_lock_irq(lock);
 	if (q->queue_lock != &q->__queue_lock)
 		q->queue_lock = &q->__queue_lock;
@@ -728,6 +730,8 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id)
 #endif
 	INIT_DELAYED_WORK(&q->delay_work, blk_delay_work);
 
+	blk_init_zones(q);
+
 	kobject_init(&q->kobj, &blk_queue_ktype);
 
 	mutex_init(&q->sysfs_lock);
diff --git a/block/blk-zoned.c b/block/blk-zoned.c
new file mode 100644
index 0000000..a107940
--- /dev/null
+++ b/block/blk-zoned.c
@@ -0,0 +1,338 @@
+/*
+ * Zoned block device handling
+ *
+ * Copyright (c) 2015, Hannes Reinecke
+ * Copyright (c) 2015, SUSE Linux GmbH
+ *
+ * Copyright (c) 2016, Damien Le Moal
+ * Copyright (c) 2016, Western Digital
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/rbtree.h>
+#include <linux/blkdev.h>
+
+void blk_init_zones(struct request_queue *q)
+{
+	spin_lock_init(&q->zones_lock);
+	q->zones = RB_ROOT;
+}
+
+/**
+ * blk_drop_zones - Empty a zoned device zone tree.
+ * @q: queue of the zoned device to operate on
+ *
+ * Free all zone descriptors added to the queue zone tree.
+ */
+void blk_drop_zones(struct request_queue *q)
+{
+	struct rb_root *root = &q->zones;
+	struct blk_zone *zone, *next;
+
+	rbtree_postorder_for_each_entry_safe(zone, next, root, node)
+		kfree(zone);
+	q->zones = RB_ROOT;
+}
+EXPORT_SYMBOL_GPL(blk_drop_zones);
+
+/**
+ * blk_insert_zone - Add a new zone struct to the queue RB-tree.
+ * @q: queue of the zoned device to operate on
+ * @new_zone: The zone struct to add
+ *
+ * If @new_zone is not already added to the zone tree, add it.
+ * Otherwise, return the existing entry.
+ */
+struct blk_zone *blk_insert_zone(struct request_queue *q,
+				 struct blk_zone *new_zone)
+{
+	struct rb_root *root = &q->zones;
+	struct rb_node **new = &(root->rb_node), *parent = NULL;
+	struct blk_zone *zone = NULL;
+	unsigned long flags;
+
+	spin_lock_irqsave(&q->zones_lock, flags);
+
+	/* Figure out where to put new node */
+	while (*new) {
+		zone = container_of(*new, struct blk_zone, node);
+		parent = *new;
+		if (new_zone->start + new_zone->len <= zone->start)
+			new = &((*new)->rb_left);
+		else if (new_zone->start >= zone->start + zone->len)
+			new = &((*new)->rb_right);
+		else
+			/* Return existing zone */
+			break;
+		zone = NULL;
+	}
+
+	if (!zone) {
+		/* No existing zone: add new node and rebalance tree */
+		rb_link_node(&new_zone->node, parent, new);
+		rb_insert_color(&new_zone->node, root);
+	}
+
+	spin_unlock_irqrestore(&q->zones_lock, flags);
+
+	return zone;
+}
+EXPORT_SYMBOL_GPL(blk_insert_zone);
+
+/**
+ * blk_lookup_zone - Search a zone in a zoned device zone tree.
+ * @q: queue of the zoned device tree to search
+ * @sector: A sector within the zone to search for
+ *
+ * Search the zone containing @sector in the zone tree owned
+ * by @q. NULL is returned if the zone is not found. Since this
+ * can be called concurrently with blk_insert_zone during device
+ * initialization, the tree traversal is protected using the
+ * zones_lock of the queue.
+ */
+struct blk_zone *blk_lookup_zone(struct request_queue *q, sector_t sector)
+{
+	struct rb_root *root = &q->zones;
+	struct rb_node *node = root->rb_node;
+	struct blk_zone *zone = NULL;
+	unsigned long flags;
+
+	spin_lock_irqsave(&q->zones_lock, flags);
+
+	while (node) {
+		zone = container_of(node, struct blk_zone, node);
+		if (sector < zone->start)
+			node = node->rb_left;
+		else if (sector >= zone->start + zone->len)
+			node = node->rb_right;
+		else
+			break;
+		zone = NULL;
+	}
+
+	spin_unlock_irqrestore(&q->zones_lock, flags);
+
+	return zone;
+}
+EXPORT_SYMBOL_GPL(blk_lookup_zone);
+
+/**
+ * Execute a zone operation (REQ_OP_ZONE*)
+ */
+static int blkdev_issue_zone_operation(struct block_device *bdev,
+				       unsigned int op,
+				       sector_t sector, sector_t nr_sects,
+				       gfp_t gfp_mask)
+{
+	struct bio *bio;
+	int ret;
+
+	if (!bdev_zoned(bdev))
+		return -EOPNOTSUPP;
+
+	/*
+	 * Make sure bi_size does not overflow because
+	 * of some weird very large zone size.
+	 */
+	if (nr_sects && (unsigned long long)nr_sects << 9 > UINT_MAX)
+		return -EINVAL;
+
+	bio = bio_alloc(gfp_mask, 1);
+	if (!bio)
+		return -ENOMEM;
+
+	bio->bi_iter.bi_sector = sector;
+	bio->bi_iter.bi_size = nr_sects << 9;
+	bio->bi_vcnt = 0;
+	bio->bi_bdev = bdev;
+	bio_set_op_attrs(bio, op, 0);
+
+	ret = submit_bio_wait(bio);
+
+	bio_put(bio);
+
+	return ret;
+}
+
+/**
+ * blkdev_update_zones - Force an update of a device's zone information
+ * @bdev:	Target block device
+ *
+ * Force an update of all zone information of @bdev. This call does not
+ * block waiting for the update to complete. On return, all zones are only
+ * marked as "in-update". Waiting on the zone update to complete can be done
+ * on a per zone basis using the function blk_wait_for_zone_update.
+ */
+int blkdev_update_zones(struct block_device *bdev,
+			gfp_t gfp_mask)
+{
+	return blkdev_issue_zone_operation(bdev, REQ_OP_ZONE_REPORT,
+					   0, 0, gfp_mask);
+}
+
+/*
+ * Wait for a zone update to complete.
+ */
+static void __blk_wait_for_zone_update(struct blk_zone *zone)
+{
+	might_sleep();
+	if (test_bit(BLK_ZONE_IN_UPDATE, &zone->flags))
+		wait_on_bit_io(&zone->flags, BLK_ZONE_IN_UPDATE,
+			       TASK_UNINTERRUPTIBLE);
+}
+
+/**
+ * blk_wait_for_zone_update - Wait for a zone information update
+ * @zone: The zone to wait for
+ *
+ * This must be called with the zone lock held. If @zone is not
+ * under update, returns immediately. Otherwise, wait for the
+ * update flag to be cleared on completion of the zone information
+ * update by the device driver.
+ */
+void blk_wait_for_zone_update(struct blk_zone *zone)
+{
+	WARN_ON_ONCE(!test_bit(BLK_ZONE_LOCKED, &zone->flags));
+	while (test_bit(BLK_ZONE_IN_UPDATE, &zone->flags)) {
+		blk_unlock_zone(zone);
+		__blk_wait_for_zone_update(zone);
+		blk_lock_zone(zone);
+	}
+}
+
+/**
+ * blkdev_report_zone - Get a zone information
+ * @bdev:	Target block device
+ * @sector:	A sector of the zone to report
+ * @update:	Force an update of the zone information
+ * @gfp_mask:	Memory allocation flags (for bio_alloc)
+ *
+ * Look up the zone containing @sector in the zone cache and return it.
+ * If update is requested, issue a report zone operation
+ * and wait for the zone information to be updated.
+ */
+struct blk_zone *blkdev_report_zone(struct block_device *bdev,
+				    sector_t sector,
+				    bool update,
+				    gfp_t gfp_mask)
+{
+	struct request_queue *q = bdev_get_queue(bdev);
+	struct blk_zone *zone;
+	int ret;
+
+	zone = blk_lookup_zone(q, sector);
+	if (!zone)
+		return ERR_PTR(-ENXIO);
+
+	if (update) {
+		ret = blkdev_issue_zone_operation(bdev, REQ_OP_ZONE_REPORT,
+						  zone->start, zone->len,
+						  gfp_mask);
+		if (ret)
+			return ERR_PTR(ret);
+		__blk_wait_for_zone_update(zone);
+	}
+
+	return zone;
+}
+
+/*
+ * Execute a zone action (open, close, reset or finish).
+ */
+static int blkdev_issue_zone_action(struct block_device *bdev,
+				    sector_t sector, unsigned int op,
+				    gfp_t gfp_mask)
+{
+	struct request_queue *q = bdev_get_queue(bdev);
+	struct blk_zone *zone;
+	sector_t nr_sects;
+	int ret;
+
+	if (!blk_queue_zoned(q))
+		return -EOPNOTSUPP;
+
+	if (sector == ~0ULL) {
+		/* All zones */
+		sector = 0;
+		nr_sects = 0;
+	} else {
+		/* This zone */
+		zone = blk_lookup_zone(q, sector);
+		if (!zone)
+			return -ENXIO;
+		sector = zone->start;
+		nr_sects = zone->len;
+	}
+
+	ret = blkdev_issue_zone_operation(bdev, op, sector,
+					  nr_sects, gfp_mask);
+	if (ret == 0 && !nr_sects)
+		blkdev_update_zones(bdev, gfp_mask);
+
+	return ret;
+}
+
+/**
+ * blkdev_reset_zone - Reset a zone write pointer
+ * @bdev:	target block device
+ * @sector:	A sector of the zone to reset or ~0ULL for all zones.
+ * @gfp_mask:	memory allocation flags (for bio_alloc)
+ *
+ * Description:
+ *    Reset a zone or all zones write pointer.
+ */
+int blkdev_reset_zone(struct block_device *bdev,
+		      sector_t sector, gfp_t gfp_mask)
+{
+	return blkdev_issue_zone_action(bdev, sector, REQ_OP_ZONE_RESET,
+					gfp_mask);
+}
+
+/**
+ * blkdev_open_zone - Explicitly open a zone
+ * @bdev:	target block device
+ * @sector:	A sector of the zone to open or ~0ULL for all zones.
+ * @gfp_mask:	memory allocation flags (for bio_alloc)
+ *
+ * Description:
+ *    Open a zone or all possible zones.
+ */
+int blkdev_open_zone(struct block_device *bdev,
+		     sector_t sector, gfp_t gfp_mask)
+{
+	return blkdev_issue_zone_action(bdev, sector, REQ_OP_ZONE_OPEN,
+					gfp_mask);
+}
+
+/**
+ * blkdev_close_zone - Close an open zone
+ * @bdev:	target block device
+ * @sector:	A sector of the zone to close or ~0ULL for all zones.
+ * @gfp_mask:	memory allocation flags (for bio_alloc)
+ *
+ * Description:
+ *    Close a zone or all open zones.
+ */
+int blkdev_close_zone(struct block_device *bdev,
+		      sector_t sector, gfp_t gfp_mask)
+{
+	return blkdev_issue_zone_action(bdev, sector, REQ_OP_ZONE_CLOSE,
+					gfp_mask);
+}
+
+/**
+ * blkdev_finish_zone - Finish a zone (make it full)
+ * @bdev:	target block device
+ * @sector:	A sector of the zone to finish or ~0ULL for all zones.
+ * @gfp_mask:	memory allocation flags (for bio_alloc)
+ *
+ * Description:
+ *    Finish one zone or all possible zones.
+ */
+int blkdev_finish_zone(struct block_device *bdev,
+		       sector_t sector, gfp_t gfp_mask)
+{
+	return blkdev_issue_zone_action(bdev, sector, REQ_OP_ZONE_FINISH,
+					gfp_mask);
+}
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 1c74b19..1165594 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -24,6 +24,7 @@
 #include <linux/rcupdate.h>
 #include <linux/percpu-refcount.h>
 #include <linux/scatterlist.h>
+#include <linux/bit_spinlock.h>
 
 struct module;
 struct scsi_ioctl_command;
@@ -302,6 +303,113 @@ struct queue_limits {
 	unsigned char		zoned;
 };
 
+#ifdef CONFIG_BLK_DEV_ZONED
+
+enum blk_zone_type {
+	BLK_ZONE_TYPE_UNKNOWN,
+	BLK_ZONE_TYPE_CONVENTIONAL,
+	BLK_ZONE_TYPE_SEQWRITE_REQ,
+	BLK_ZONE_TYPE_SEQWRITE_PREF,
+};
+
+enum blk_zone_cond {
+	BLK_ZONE_COND_NO_WP,
+	BLK_ZONE_COND_EMPTY,
+	BLK_ZONE_COND_IMP_OPEN,
+	BLK_ZONE_COND_EXP_OPEN,
+	BLK_ZONE_COND_CLOSED,
+	BLK_ZONE_COND_READONLY = 0xd,
+	BLK_ZONE_COND_FULL,
+	BLK_ZONE_COND_OFFLINE,
+};
+
+enum blk_zone_flags {
+	BLK_ZONE_LOCKED,
+	BLK_ZONE_WRITE_LOCKED,
+	BLK_ZONE_IN_UPDATE,
+};
+
+/**
+ * Zone descriptor. On 64-bit architectures,
+ * this will align on sizeof(long), i.e. 8 B,
+ * and use 64 B.
+ */
+struct blk_zone {
+	struct rb_node	node;
+	unsigned long 	flags;
+	sector_t	len;
+	sector_t 	start;
+	sector_t 	wp;
+	unsigned int 	type : 4;
+	unsigned int	cond : 4;
+	unsigned int	non_seq : 1;
+	unsigned int	reset : 1;
+};
+
+#define blk_zone_is_seq_req(z)	((z)->type == BLK_ZONE_TYPE_SEQWRITE_REQ)
+#define blk_zone_is_seq_pref(z)	((z)->type == BLK_ZONE_TYPE_SEQWRITE_PREF)
+#define blk_zone_is_seq(z)	(blk_zone_is_seq_req(z) || blk_zone_is_seq_pref(z))
+#define blk_zone_is_conv(z) 	((z)->type == BLK_ZONE_TYPE_CONVENTIONAL)
+
+#define blk_zone_is_readonly(z)	((z)->cond == BLK_ZONE_COND_READONLY)
+#define blk_zone_is_offline(z) 	((z)->cond == BLK_ZONE_COND_OFFLINE)
+#define blk_zone_is_full(z)	((z)->cond == BLK_ZONE_COND_FULL)
+#define blk_zone_is_empty(z)	((z)->cond == BLK_ZONE_COND_EMPTY)
+#define blk_zone_is_open(z)	((z)->cond == BLK_ZONE_COND_EXP_OPEN)
+
+static inline void blk_lock_zone(struct blk_zone *zone)
+{
+	bit_spin_lock(BLK_ZONE_LOCKED, &zone->flags);
+}
+
+static inline int blk_trylock_zone(struct blk_zone *zone)
+{
+	return bit_spin_trylock(BLK_ZONE_LOCKED, &zone->flags);
+}
+
+static inline void blk_unlock_zone(struct blk_zone *zone)
+{
+	bit_spin_unlock(BLK_ZONE_LOCKED, &zone->flags);
+}
+
+static inline int blk_try_write_lock_zone(struct blk_zone *zone)
+{
+	return !test_and_set_bit(BLK_ZONE_WRITE_LOCKED, &zone->flags);
+}
+
+static inline void blk_write_unlock_zone(struct blk_zone *zone)
+{
+	clear_bit_unlock(BLK_ZONE_WRITE_LOCKED, &zone->flags);
+	smp_mb__after_atomic();
+}
+
+extern void blk_init_zones(struct request_queue *);
+extern void blk_drop_zones(struct request_queue *);
+extern struct blk_zone *blk_insert_zone(struct request_queue *,
+					struct blk_zone *);
+extern struct blk_zone *blk_lookup_zone(struct request_queue *, sector_t);
+
+extern int blkdev_update_zones(struct block_device *, gfp_t);
+extern void blk_wait_for_zone_update(struct blk_zone *);
+#define blk_zone_in_update(z)	test_bit(BLK_ZONE_IN_UPDATE, &(z)->flags)
+static inline void blk_clear_zone_update(struct blk_zone *zone)
+{
+	clear_bit_unlock(BLK_ZONE_IN_UPDATE, &zone->flags);
+	smp_mb__after_atomic();
+	wake_up_bit(&zone->flags, BLK_ZONE_IN_UPDATE);
+}
+
+extern struct blk_zone *blkdev_report_zone(struct block_device *,
+					   sector_t, bool, gfp_t);
+extern int blkdev_reset_zone(struct block_device *, sector_t, gfp_t);
+extern int blkdev_open_zone(struct block_device *, sector_t, gfp_t);
+extern int blkdev_close_zone(struct block_device *, sector_t, gfp_t);
+extern int blkdev_finish_zone(struct block_device *, sector_t, gfp_t);
+#else /* CONFIG_BLK_DEV_ZONED */
+static inline void blk_init_zones(struct request_queue *q) { }
+static inline void blk_drop_zones(struct request_queue *q) { }
+#endif /* CONFIG_BLK_DEV_ZONED */
+
 struct request_queue {
 	/*
 	 * Together with queue_head for cacheline sharing
@@ -404,6 +512,11 @@ struct request_queue {
 	unsigned int		nr_pending;
 #endif
 
+#ifdef CONFIG_BLK_DEV_ZONED
+	spinlock_t		zones_lock;
+	struct rb_root		zones;
+#endif
+
 	/*
 	 * queue settings
 	 */
-- 
2.7.4
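
Note for readers following the lookup logic in blk_lookup_zone(): the
containment test it applies at each tree node can be tried outside the
kernel. The sketch below is not part of the patch. It assumes a sorted,
non-overlapping array in place of the RB-tree and takes no lock; struct
zone_desc and zone_lookup() are hypothetical names used only for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for struct blk_zone (start/len only). */
struct zone_desc {
	uint64_t start;	/* first sector of the zone */
	uint64_t len;	/* zone length in sectors */
};

/*
 * Return the zone containing @sector, or NULL if no zone does.
 * @zones must be sorted by start sector and must not overlap.
 */
static const struct zone_desc *zone_lookup(const struct zone_desc *zones,
					   size_t nr_zones, uint64_t sector)
{
	size_t lo = 0, hi = nr_zones;

	assert(zones != NULL || nr_zones == 0);

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		const struct zone_desc *z = &zones[mid];

		if (sector < z->start)
			hi = mid;	/* corresponds to following rb_left */
		else if (sector >= z->start + z->len)
			lo = mid + 1;	/* corresponds to following rb_right */
		else
			return z;	/* start <= sector < start + len */
	}

	return NULL;
}
```

As in the patch, a sector that falls in a gap between zones or past the
last zone yields NULL, which blkdev_report_zone() turns into -ENXIO.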



* [PATCH 6/9] block: Add 'BLKPREP_DONE' return value
  2016-09-19 21:27 ` Damien Le Moal
@ 2016-09-19 21:27   ` Damien Le Moal
  -1 siblings, 0 replies; 36+ messages in thread
From: Damien Le Moal @ 2016-09-19 21:27 UTC (permalink / raw)
  To: linux-scsi, linux-block
  Cc: martin.petersen, axboe, hare, shaun.tancheff, Damien Le Moal

From: Hannes Reinecke <hare@suse.de>

Add a new blkprep return code BLKPREP_DONE to signal completion
without I/O error.

Signed-off-by: Hannes Reinecke <hare@suse.de>

Changelog (Damien):
Rewrite adding blk_prep_end_request as suggested by Christoph Hellwig

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
---
 block/blk-core.c        | 42 ++++++++++++++++++++++++++----------------
 drivers/scsi/scsi_lib.c |  1 +
 include/linux/blkdev.h  |  1 +
 3 files changed, 28 insertions(+), 16 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 2c5d069d..8dbbb1a 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2341,6 +2341,17 @@ void blk_account_io_start(struct request *rq, bool new_io)
 	part_stat_unlock();
 }
 
+static void blk_prep_end_request(struct request *rq, int error)
+{
+	rq->cmd_flags |= REQ_QUIET;
+	/*
+	 * Mark this request as started so we don't trigger
+	 * any debug logic in the end I/O path.
+	 */
+	blk_start_request(rq);
+	__blk_end_request_all(rq, error);
+}
+
 /**
  * blk_peek_request - peek at the top of a request queue
  * @q: request queue to peek at
@@ -2408,9 +2419,10 @@ struct request *blk_peek_request(struct request_queue *q)
 			break;
 
 		ret = q->prep_rq_fn(q, rq);
-		if (ret == BLKPREP_OK) {
-			break;
-		} else if (ret == BLKPREP_DEFER) {
+		switch (ret) {
+		case BLKPREP_OK:
+			goto out;
+		case BLKPREP_DEFER:
 			/*
 			 * the request may have been (partially) prepped.
 			 * we need to keep this request in the front to
@@ -2425,25 +2437,23 @@ struct request *blk_peek_request(struct request_queue *q)
 				 */
 				--rq->nr_phys_segments;
 			}
-
 			rq = NULL;
+			goto out;
+		case BLKPREP_KILL:
+			blk_prep_end_request(rq, -EIO);
 			break;
-		} else if (ret == BLKPREP_KILL || ret == BLKPREP_INVALID) {
-			int err = (ret == BLKPREP_INVALID) ? -EREMOTEIO : -EIO;
-
-			rq->cmd_flags |= REQ_QUIET;
-			/*
-			 * Mark this request as started so we don't trigger
-			 * any debug logic in the end I/O path.
-			 */
-			blk_start_request(rq);
-			__blk_end_request_all(rq, err);
-		} else {
+		case BLKPREP_INVALID:
+			blk_prep_end_request(rq, -EREMOTEIO);
+			break;
+		case BLKPREP_DONE:
+			blk_prep_end_request(rq, 0);
+			break;
+		default:
 			printk(KERN_ERR "%s: bad return=%d\n", __func__, ret);
 			break;
 		}
 	}
-
+out:
 	return rq;
 }
 EXPORT_SYMBOL(blk_peek_request);
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index c71344a..f99504d 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1260,6 +1260,7 @@ scsi_prep_return(struct request_queue *q, struct request *req, int ret)
 	case BLKPREP_KILL:
 	case BLKPREP_INVALID:
 		req->errors = DID_NO_CONNECT << 16;
+	case BLKPREP_DONE:
 		/* release the command and kill it */
 		if (req->special) {
 			struct scsi_cmnd *cmd = req->special;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 1165594..a85f95b 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -819,6 +819,7 @@ enum {
 	BLKPREP_KILL,		/* fatal error, kill, return -EIO */
 	BLKPREP_DEFER,		/* leave on queue */
 	BLKPREP_INVALID,	/* invalid command, kill, return -EREMOTEIO */
+	BLKPREP_DONE,		/* complete w/o error */
 };
 
 extern unsigned long blk_max_low_pfn, blk_max_pfn;
-- 
2.7.4
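
Note for readers: the rewritten switch in blk_peek_request() gives each
prep return code one of a few outcomes. The user-space sketch below
summarizes them as read from the diff above. It is not part of the
patch; the PREP_* and ACT_* names and prep_ret_action() are hypothetical
stand-ins for illustration, and EREMOTEIO is given a fallback definition
in case the host errno.h lacks it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifndef EREMOTEIO
#define EREMOTEIO 121	/* Linux value, assumed here for non-Linux hosts */
#endif

/* Hypothetical stand-ins for the BLKPREP_* return codes. */
enum prep_ret {
	PREP_OK,
	PREP_KILL,
	PREP_DEFER,
	PREP_INVALID,
	PREP_DONE,
};

/* What the peek loop does with the request once prep has returned. */
enum prep_action {
	ACT_DISPATCH,	/* return the request to the caller */
	ACT_REQUEUE,	/* keep it at the head of the queue, return NULL */
	ACT_COMPLETE,	/* end it now with *error, then peek the next one */
	ACT_BAD,	/* unknown code: log it and keep looping */
};

static enum prep_action prep_ret_action(int ret, int *error)
{
	assert(error != NULL);
	*error = 0;

	switch (ret) {
	case PREP_OK:
		return ACT_DISPATCH;
	case PREP_DEFER:
		return ACT_REQUEUE;
	case PREP_KILL:
		*error = -EIO;
		return ACT_COMPLETE;
	case PREP_INVALID:
		*error = -EREMOTEIO;
		return ACT_COMPLETE;
	case PREP_DONE:
		/* New with this patch: complete without an I/O error. */
		return ACT_COMPLETE;
	default:
		return ACT_BAD;
	}
}
```

The only difference between PREP_DONE and the two kill codes in this
model is the error value handed to the completion path.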


* [PATCH 7/9] block: Add 'BLK_MQ_RQ_QUEUE_DONE' return value
  2016-09-19 21:27 ` Damien Le Moal
@ 2016-09-19 21:27   ` Damien Le Moal
  -1 siblings, 0 replies; 36+ messages in thread
From: Damien Le Moal @ 2016-09-19 21:27 UTC (permalink / raw)
  To: linux-scsi, linux-block
  Cc: martin.petersen, axboe, hare, shaun.tancheff, Hannes Reinecke,
	Damien Le Moal

From: Hannes Reinecke <hare@suse.de>

Add a return value BLK_MQ_RQ_QUEUE_DONE to terminate a request
without error.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
---
 block/blk-mq.c          | 1 +
 drivers/scsi/scsi_lib.c | 3 +++
 include/linux/blk-mq.h  | 1 +
 3 files changed, 5 insertions(+)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 13f5a6c..6300629 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -851,6 +851,7 @@ static void __blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx)
 			pr_err("blk-mq: bad return on queue: %d\n", ret);
 		case BLK_MQ_RQ_QUEUE_ERROR:
 			rq->errors = -EIO;
+		case BLK_MQ_RQ_QUEUE_DONE:
 			blk_mq_end_request(rq, rq->errors);
 			break;
 		}
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index f99504d..793b791 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1805,6 +1805,8 @@ static inline int prep_to_mq(int ret)
 		return 0;
 	case BLKPREP_DEFER:
 		return BLK_MQ_RQ_QUEUE_BUSY;
+	case BLKPREP_DONE:
+		return BLK_MQ_RQ_QUEUE_DONE;
 	default:
 		return BLK_MQ_RQ_QUEUE_ERROR;
 	}
@@ -1948,6 +1950,7 @@ out:
 			blk_mq_delay_queue(hctx, SCSI_QUEUE_DELAY);
 		break;
 	case BLK_MQ_RQ_QUEUE_ERROR:
+	case BLK_MQ_RQ_QUEUE_DONE:
 		/*
 		 * Make sure to release all allocated ressources when
 		 * we hit an error, as we will never see this command
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index e43bbff..07b4888 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -153,6 +153,7 @@ enum {
 	BLK_MQ_RQ_QUEUE_OK	= 0,	/* queued fine */
 	BLK_MQ_RQ_QUEUE_BUSY	= 1,	/* requeue IO for later */
 	BLK_MQ_RQ_QUEUE_ERROR	= 2,	/* end IO with error */
+	BLK_MQ_RQ_QUEUE_DONE	= 3,	/* end IO w/o error */
 
 	BLK_MQ_F_SHOULD_MERGE	= 1 << 0,
 	BLK_MQ_F_TAG_SHARED	= 1 << 1,
-- 
2.7.4
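
Note for readers: with this patch a BLKPREP_DONE result from the prep
path is translated to BLK_MQ_RQ_QUEUE_DONE, and blk-mq then ends the
request with whatever rq->errors already holds instead of forcing -EIO.
The user-space sketch below models those two steps as read from the
diff above. It is not part of the patch; the PREP_* and MQ_* constants
and both function names are hypothetical stand-ins.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the BLKPREP_* codes. */
enum { PREP_OK, PREP_KILL, PREP_DEFER, PREP_INVALID, PREP_DONE };

/* Hypothetical stand-ins for the BLK_MQ_RQ_QUEUE_* codes. */
enum {
	MQ_QUEUE_OK	= 0,	/* queued fine */
	MQ_QUEUE_BUSY	= 1,	/* requeue IO for later */
	MQ_QUEUE_ERROR	= 2,	/* end IO with error */
	MQ_QUEUE_DONE	= 3,	/* end IO w/o error */
};

/* Model of prep_to_mq() after the patch. */
static int prep_to_mq_sketch(int ret)
{
	switch (ret) {
	case PREP_OK:
		return MQ_QUEUE_OK;
	case PREP_DEFER:
		return MQ_QUEUE_BUSY;
	case PREP_DONE:
		return MQ_QUEUE_DONE;
	default:
		return MQ_QUEUE_ERROR;
	}
}

/*
 * Model of the completion status chosen in __blk_mq_run_hw_queue():
 * ERROR overwrites the request status with -EIO, DONE keeps the
 * status the driver left in place. Only meaningful for those two codes.
 */
static int mq_end_status(int mq_ret, int rq_errors)
{
	assert(mq_ret == MQ_QUEUE_ERROR || mq_ret == MQ_QUEUE_DONE);

	if (mq_ret == MQ_QUEUE_ERROR)
		return -EIO;
	return rq_errors;
}
```

A driver that wants a clean completion through this path is therefore
expected to leave rq->errors at 0 before returning the DONE code.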


* [PATCH 8/9] sd: Implement support for ZBC devices
  2016-09-19 21:27 ` Damien Le Moal
@ 2016-09-19 21:27   ` Damien Le Moal
  -1 siblings, 0 replies; 36+ messages in thread
From: Damien Le Moal @ 2016-09-19 21:27 UTC (permalink / raw)
  To: linux-scsi, linux-block
  Cc: martin.petersen, axboe, hare, shaun.tancheff, Hannes Reinecke,
	Damien Le Moal

From: Hannes Reinecke <hare@suse.com>

Implement ZBC support functions to setup zoned disks and fill the
block device zone information tree during the device scan. The
zone information tree is also always updated on disk revalidation.
This adds support for the REQ_OP_ZONE* operations and also implements
the new RESET_WP provisioning mode so that discard requests can be
mapped to the RESET WRITE POINTER command for devices with a constant
zone size.

The capacity read of the device triggers the zone information read
for zoned block devices. As this needs the device zone model, the
call to sd_read_capacity is moved after the call to
sd_read_block_characteristics so that host-aware devices are
properly initialized. The call to sd_zbc_read_zones in
sd_read_capacity may change the device capacity obtained with
the read_capacity_16 function for devices reporting only the
capacity of conventional zones at the beginning of the LBA range
(i.e. devices with rc_basis set to 0).

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
---
 drivers/scsi/Makefile     |    1 +
 drivers/scsi/sd.c         |  147 ++++--
 drivers/scsi/sd.h         |   68 +++
 drivers/scsi/sd_zbc.c     | 1097 +++++++++++++++++++++++++++++++++++++++++++++
 include/scsi/scsi_proto.h |   17 +
 5 files changed, 1304 insertions(+), 26 deletions(-)
 create mode 100644 drivers/scsi/sd_zbc.c

diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile
index d539798..fabcb6d 100644
--- a/drivers/scsi/Makefile
+++ b/drivers/scsi/Makefile
@@ -179,6 +179,7 @@ hv_storvsc-y			:= storvsc_drv.o
 
 sd_mod-objs	:= sd.o
 sd_mod-$(CONFIG_BLK_DEV_INTEGRITY) += sd_dif.o
+sd_mod-$(CONFIG_BLK_DEV_ZONED) += sd_zbc.o
 
 sr_mod-objs	:= sr.o sr_ioctl.o sr_vendor.o
 ncr53c8xx-flags-$(CONFIG_SCSI_ZALON) \
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index d3e852a..46b8b78 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -92,6 +92,7 @@ MODULE_ALIAS_BLOCKDEV_MAJOR(SCSI_DISK15_MAJOR);
 MODULE_ALIAS_SCSI_DEVICE(TYPE_DISK);
 MODULE_ALIAS_SCSI_DEVICE(TYPE_MOD);
 MODULE_ALIAS_SCSI_DEVICE(TYPE_RBC);
+MODULE_ALIAS_SCSI_DEVICE(TYPE_ZBC);
 
 #if !defined(CONFIG_DEBUG_BLOCK_EXT_DEVT)
 #define SD_MINORS	16
@@ -99,7 +100,6 @@ MODULE_ALIAS_SCSI_DEVICE(TYPE_RBC);
 #define SD_MINORS	0
 #endif
 
-static void sd_config_discard(struct scsi_disk *, unsigned int);
 static void sd_config_write_same(struct scsi_disk *);
 static int  sd_revalidate_disk(struct gendisk *);
 static void sd_unlock_native_capacity(struct gendisk *disk);
@@ -162,7 +162,7 @@ cache_type_store(struct device *dev, struct device_attribute *attr,
 	static const char temp[] = "temporary ";
 	int len;
 
-	if (sdp->type != TYPE_DISK)
+	if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
 		/* no cache control on RBC devices; theoretically they
 		 * can do it, but there's probably so many exceptions
 		 * it's not worth the risk */
@@ -261,7 +261,7 @@ allow_restart_store(struct device *dev, struct device_attribute *attr,
 	if (!capable(CAP_SYS_ADMIN))
 		return -EACCES;
 
-	if (sdp->type != TYPE_DISK)
+	if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
 		return -EINVAL;
 
 	sdp->allow_restart = simple_strtoul(buf, NULL, 10);
@@ -369,6 +369,7 @@ static const char *lbp_mode[] = {
 	[SD_LBP_WS16]		= "writesame_16",
 	[SD_LBP_WS10]		= "writesame_10",
 	[SD_LBP_ZERO]		= "writesame_zero",
+	[SD_ZBC_RESET_WP]	= "reset_wp",
 	[SD_LBP_DISABLE]	= "disabled",
 };
 
@@ -391,6 +392,13 @@ provisioning_mode_store(struct device *dev, struct device_attribute *attr,
 	if (!capable(CAP_SYS_ADMIN))
 		return -EACCES;
 
+	if (sdkp->zoned == 1 || sdp->type == TYPE_ZBC) {
+		if (!strncmp(buf, lbp_mode[SD_ZBC_RESET_WP], 20)) {
+			sd_config_discard(sdkp, SD_ZBC_RESET_WP);
+			return count;
+		}
+		return -EINVAL;
+	}
 	if (sdp->type != TYPE_DISK)
 		return -EINVAL;
 
@@ -458,7 +466,7 @@ max_write_same_blocks_store(struct device *dev, struct device_attribute *attr,
 	if (!capable(CAP_SYS_ADMIN))
 		return -EACCES;
 
-	if (sdp->type != TYPE_DISK)
+	if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
 		return -EINVAL;
 
 	err = kstrtoul(buf, 10, &max);
@@ -631,7 +639,7 @@ static unsigned char sd_setup_protect_cmnd(struct scsi_cmnd *scmd,
 	return protect;
 }
 
-static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode)
+void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode)
 {
 	struct request_queue *q = sdkp->disk->queue;
 	unsigned int logical_block_size = sdkp->device->sector_size;
@@ -683,6 +691,11 @@ static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode)
 		q->limits.discard_zeroes_data = sdkp->lbprz;
 		break;
 
+	case SD_ZBC_RESET_WP:
+		max_blocks = min_not_zero(sdkp->max_unmap_blocks,
+					  (u32)SD_MAX_WS16_BLOCKS);
+		break;
+
 	case SD_LBP_ZERO:
 		max_blocks = min_not_zero(sdkp->max_ws_blocks,
 					  (u32)SD_MAX_WS10_BLOCKS);
@@ -711,16 +724,20 @@ static int sd_setup_discard_cmnd(struct scsi_cmnd *cmd)
 	unsigned int nr_sectors = blk_rq_sectors(rq);
 	unsigned int nr_bytes = blk_rq_bytes(rq);
 	unsigned int len;
-	int ret;
+	int ret = BLKPREP_OK;
 	char *buf;
-	struct page *page;
+	struct page *page = NULL;
 
 	sector >>= ilog2(sdp->sector_size) - 9;
 	nr_sectors >>= ilog2(sdp->sector_size) - 9;
 
-	page = alloc_page(GFP_ATOMIC | __GFP_ZERO);
-	if (!page)
-		return BLKPREP_DEFER;
+	if (sdkp->provisioning_mode != SD_ZBC_RESET_WP) {
+		page = alloc_page(GFP_ATOMIC | __GFP_ZERO);
+		if (!page)
+			return BLKPREP_DEFER;
+	}
+
+	rq->completion_data = page;
 
 	switch (sdkp->provisioning_mode) {
 	case SD_LBP_UNMAP:
@@ -760,12 +777,19 @@ static int sd_setup_discard_cmnd(struct scsi_cmnd *cmd)
 		len = sdkp->device->sector_size;
 		break;
 
+	case SD_ZBC_RESET_WP:
+		ret = sd_zbc_setup_reset_cmnd(cmd);
+		if (ret != BLKPREP_OK)
+			goto out;
+		/* Reset Write Pointer doesn't have a payload */
+		len = 0;
+		break;
+
 	default:
 		ret = BLKPREP_INVALID;
 		goto out;
 	}
 
-	rq->completion_data = page;
 	rq->timeout = SD_TIMEOUT;
 
 	cmd->transfersize = len;
@@ -779,13 +803,17 @@ static int sd_setup_discard_cmnd(struct scsi_cmnd *cmd)
 	 * discarded on disk. This allows us to report completion on the full
 	 * amount of blocks described by the request.
 	 */
-	blk_add_request_payload(rq, page, 0, len);
-	ret = scsi_init_io(cmd);
+	if (len) {
+		blk_add_request_payload(rq, page, 0, len);
+		ret = scsi_init_io(cmd);
+	}
 	rq->__data_len = nr_bytes;
 
 out:
-	if (ret != BLKPREP_OK)
+	if (page && ret != BLKPREP_OK) {
+		rq->completion_data = NULL;
 		__free_page(page);
+	}
 	return ret;
 }
 
@@ -843,6 +871,13 @@ static int sd_setup_write_same_cmnd(struct scsi_cmnd *cmd)
 
 	BUG_ON(bio_offset(bio) || bio_iovec(bio).bv_len != sdp->sector_size);
 
+	if (sdkp->zoned == 1 || sdp->type == TYPE_ZBC) {
+		/* sd_zbc_setup_read_write uses block layer sector units */
+		ret = sd_zbc_setup_read_write(sdkp, rq, sector, &nr_sectors);
+		if (ret != BLKPREP_OK)
+			return ret;
+	}
+
 	sector >>= ilog2(sdp->sector_size) - 9;
 	nr_sectors >>= ilog2(sdp->sector_size) - 9;
 
@@ -962,6 +997,13 @@ static int sd_setup_read_write_cmnd(struct scsi_cmnd *SCpnt)
 	SCSI_LOG_HLQUEUE(2, scmd_printk(KERN_INFO, SCpnt, "block=%llu\n",
 					(unsigned long long)block));
 
+	if (sdkp->zoned == 1 || sdp->type == TYPE_ZBC) {
+		/* sd_zbc_setup_read_write uses block layer sector units */
+		ret = sd_zbc_setup_read_write(sdkp, rq, block, &this_count);
+		if (ret != BLKPREP_OK)
+			goto out;
+	}
+
 	/*
 	 * If we have a 1K hardware sectorsize, prevent access to single
 	 * 512 byte sectors.  In theory we could handle this - in fact
@@ -1148,6 +1190,16 @@ static int sd_init_command(struct scsi_cmnd *cmd)
 	case REQ_OP_READ:
 	case REQ_OP_WRITE:
 		return sd_setup_read_write_cmnd(cmd);
+	case REQ_OP_ZONE_REPORT:
+		return sd_zbc_setup_report_cmnd(cmd);
+	case REQ_OP_ZONE_RESET:
+		return sd_zbc_setup_reset_cmnd(cmd);
+	case REQ_OP_ZONE_OPEN:
+		return sd_zbc_setup_open_cmnd(cmd);
+	case REQ_OP_ZONE_CLOSE:
+		return sd_zbc_setup_close_cmnd(cmd);
+	case REQ_OP_ZONE_FINISH:
+		return sd_zbc_setup_finish_cmnd(cmd);
 	default:
 		BUG();
 	}
@@ -1157,7 +1209,8 @@ static void sd_uninit_command(struct scsi_cmnd *SCpnt)
 {
 	struct request *rq = SCpnt->request;
 
-	if (req_op(rq) == REQ_OP_DISCARD)
+	if (req_op(rq) == REQ_OP_DISCARD &&
+	    rq->completion_data)
 		__free_page(rq->completion_data);
 
 	if (SCpnt->cmnd != rq->cmd) {
@@ -1778,8 +1831,16 @@ static int sd_done(struct scsi_cmnd *SCpnt)
 	int sense_deferred = 0;
 	unsigned char op = SCpnt->cmnd[0];
 	unsigned char unmap = SCpnt->cmnd[1] & 8;
+	unsigned char sa = SCpnt->cmnd[1] & 0xf;
 
-	if (req_op(req) == REQ_OP_DISCARD || req_op(req) == REQ_OP_WRITE_SAME) {
+	switch (req_op(req)) {
+	case REQ_OP_DISCARD:
+	case REQ_OP_WRITE_SAME:
+	case REQ_OP_ZONE_REPORT:
+	case REQ_OP_ZONE_RESET:
+	case REQ_OP_ZONE_OPEN:
+	case REQ_OP_ZONE_CLOSE:
+	case REQ_OP_ZONE_FINISH:
 		if (!result) {
 			good_bytes = blk_rq_bytes(req);
 			scsi_set_resid(SCpnt, 0);
@@ -1787,6 +1848,7 @@ static int sd_done(struct scsi_cmnd *SCpnt)
 			good_bytes = 0;
 			scsi_set_resid(SCpnt, blk_rq_bytes(req));
 		}
+		break;
 	}
 
 	if (result) {
@@ -1829,6 +1891,10 @@ static int sd_done(struct scsi_cmnd *SCpnt)
 			case UNMAP:
 				sd_config_discard(sdkp, SD_LBP_DISABLE);
 				break;
+			case ZBC_OUT:
+				if (sa == ZO_RESET_WRITE_POINTER)
+					sd_config_discard(sdkp, SD_LBP_DISABLE);
+				break;
 			case WRITE_SAME_16:
 			case WRITE_SAME:
 				if (unmap)
@@ -1847,7 +1913,11 @@ static int sd_done(struct scsi_cmnd *SCpnt)
 	default:
 		break;
 	}
+
  out:
+	if (sdkp->zoned == 1 || sdkp->device->type == TYPE_ZBC)
+		sd_zbc_done(SCpnt, &sshdr);
+
 	SCSI_LOG_HLCOMPLETE(1, scmd_printk(KERN_INFO, SCpnt,
 					   "sd_done: completed %d of %d bytes\n",
 					   good_bytes, scsi_bufflen(SCpnt)));
@@ -1982,7 +2052,6 @@ sd_spinup_disk(struct scsi_disk *sdkp)
 	}
 }
 
-
 /*
  * Determine whether disk supports Data Integrity Field.
  */
@@ -2132,6 +2201,9 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
 	/* Logical blocks per physical block exponent */
 	sdkp->physical_block_size = (1 << (buffer[13] & 0xf)) * sector_size;
 
+	/* RC basis */
+	sdkp->rc_basis = (buffer[12] >> 4) & 0x3;
+
 	/* Lowest aligned logical block */
 	alignment = ((buffer[14] & 0x3f) << 8 | buffer[15]) * sector_size;
 	blk_queue_alignment_offset(sdp->request_queue, alignment);
@@ -2322,6 +2394,11 @@ got_data:
 		sector_size = 512;
 	}
 	blk_queue_logical_block_size(sdp->request_queue, sector_size);
+	blk_queue_physical_block_size(sdp->request_queue,
+				      sdkp->physical_block_size);
+	sdkp->device->sector_size = sector_size;
+
+	sd_zbc_read_zones(sdkp, buffer);
 
 	{
 		char cap_str_2[10], cap_str_10[10];
@@ -2348,9 +2425,6 @@ got_data:
 	if (sdkp->capacity > 0xffffffff)
 		sdp->use_16_for_rw = 1;
 
-	blk_queue_physical_block_size(sdp->request_queue,
-				      sdkp->physical_block_size);
-	sdkp->device->sector_size = sector_size;
 }
 
 /* called with buffer of length 512 */
@@ -2612,7 +2686,7 @@ static void sd_read_app_tag_own(struct scsi_disk *sdkp, unsigned char *buffer)
 	struct scsi_mode_data data;
 	struct scsi_sense_hdr sshdr;
 
-	if (sdp->type != TYPE_DISK)
+	if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
 		return;
 
 	if (sdkp->protection_type == 0)
@@ -2719,6 +2793,7 @@ static void sd_read_block_limits(struct scsi_disk *sdkp)
  */
 static void sd_read_block_characteristics(struct scsi_disk *sdkp)
 {
+	struct request_queue *q = sdkp->disk->queue;
 	unsigned char *buffer;
 	u16 rot;
 	const int vpd_len = 64;
@@ -2733,10 +2808,21 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
 	rot = get_unaligned_be16(&buffer[4]);
 
 	if (rot == 1) {
-		queue_flag_set_unlocked(QUEUE_FLAG_NONROT, sdkp->disk->queue);
-		queue_flag_clear_unlocked(QUEUE_FLAG_ADD_RANDOM, sdkp->disk->queue);
+		queue_flag_set_unlocked(QUEUE_FLAG_NONROT, q);
+		queue_flag_clear_unlocked(QUEUE_FLAG_ADD_RANDOM, q);
 	}
 
+	sdkp->zoned = (buffer[8] >> 4) & 3;
+	if (sdkp->zoned == 1)
+		q->limits.zoned = BLK_ZONED_HA;
+	else if (sdkp->device->type == TYPE_ZBC)
+		q->limits.zoned = BLK_ZONED_HM;
+	else
+		q->limits.zoned = BLK_ZONED_NONE;
+	if (blk_queue_zoned(q) && sdkp->first_scan)
+		sd_printk(KERN_NOTICE, sdkp, "Host-%s zoned block device\n",
+			  q->limits.zoned == BLK_ZONED_HM ? "managed" : "aware");
+
  out:
 	kfree(buffer);
 }
@@ -2835,14 +2921,14 @@ static int sd_revalidate_disk(struct gendisk *disk)
 	 * react badly if we do.
 	 */
 	if (sdkp->media_present) {
-		sd_read_capacity(sdkp, buffer);
-
 		if (scsi_device_supports_vpd(sdp)) {
 			sd_read_block_provisioning(sdkp);
 			sd_read_block_limits(sdkp);
 			sd_read_block_characteristics(sdkp);
 		}
 
+		sd_read_capacity(sdkp, buffer);
+
 		sd_read_write_protect_flag(sdkp, buffer);
 		sd_read_cache_type(sdkp, buffer);
 		sd_read_app_tag_own(sdkp, buffer);
@@ -3040,9 +3126,16 @@ static int sd_probe(struct device *dev)
 
 	scsi_autopm_get_device(sdp);
 	error = -ENODEV;
-	if (sdp->type != TYPE_DISK && sdp->type != TYPE_MOD && sdp->type != TYPE_RBC)
+	if (sdp->type != TYPE_DISK &&
+	    sdp->type != TYPE_ZBC &&
+	    sdp->type != TYPE_MOD &&
+	    sdp->type != TYPE_RBC)
 		goto out;
 
+#ifndef CONFIG_BLK_DEV_ZONED
+	if (sdp->type == TYPE_ZBC)
+		goto out;
+#endif
 	SCSI_LOG_HLQUEUE(3, sdev_printk(KERN_INFO, sdp,
 					"sd_probe\n"));
 
@@ -3146,6 +3239,8 @@ static int sd_remove(struct device *dev)
 	del_gendisk(sdkp->disk);
 	sd_shutdown(dev);
 
+	sd_zbc_remove(sdkp);
+
 	blk_register_region(devt, SD_MINORS, NULL,
 			    sd_default_probe, NULL, NULL);
 
diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
index 765a6f1..3452871 100644
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -56,6 +56,7 @@ enum {
 	SD_LBP_WS16,		/* Use WRITE SAME(16) with UNMAP bit */
 	SD_LBP_WS10,		/* Use WRITE SAME(10) with UNMAP bit */
 	SD_LBP_ZERO,		/* Use WRITE SAME(10) with zero payload */
+	SD_ZBC_RESET_WP,	/* Use RESET WRITE POINTER */
 	SD_LBP_DISABLE,		/* Discard disabled due to failed cmd */
 };
 
@@ -64,6 +65,11 @@ struct scsi_disk {
 	struct scsi_device *device;
 	struct device	dev;
 	struct gendisk	*disk;
+#ifdef CONFIG_BLK_DEV_ZONED
+	struct workqueue_struct *zone_work_q;
+	sector_t zone_sectors;
+	unsigned int nr_zones;
+#endif
 	atomic_t	openers;
 	sector_t	capacity;	/* size in logical blocks */
 	u32		max_xfer_blocks;
@@ -94,6 +100,8 @@ struct scsi_disk {
 	unsigned	lbpvpd : 1;
 	unsigned	ws10 : 1;
 	unsigned	ws16 : 1;
+	unsigned	rc_basis: 2;
+	unsigned	zoned: 2;
 };
 #define to_scsi_disk(obj) container_of(obj,struct scsi_disk,dev)
 
@@ -156,6 +164,13 @@ static inline unsigned int logical_to_bytes(struct scsi_device *sdev, sector_t b
 	return blocks * sdev->sector_size;
 }
 
+static inline sector_t sectors_to_logical(struct scsi_device *sdev, sector_t sector)
+{
+	return sector >> (ilog2(sdev->sector_size) - 9);
+}
+
+extern void sd_config_discard(struct scsi_disk *, unsigned int);
+
 /*
  * A DIF-capable target device can be formatted with different
  * protection schemes.  Currently 0 through 3 are defined:
@@ -269,4 +284,57 @@ static inline void sd_dif_complete(struct scsi_cmnd *cmd, unsigned int a)
 
 #endif /* CONFIG_BLK_DEV_INTEGRITY */
 
+#ifdef CONFIG_BLK_DEV_ZONED
+
+extern void sd_zbc_read_zones(struct scsi_disk *, char *);
+extern void sd_zbc_remove(struct scsi_disk *);
+extern int sd_zbc_setup_read_write(struct scsi_disk *, struct request *,
+				   sector_t, unsigned int *);
+extern int sd_zbc_setup_report_cmnd(struct scsi_cmnd *);
+extern int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *);
+extern int sd_zbc_setup_open_cmnd(struct scsi_cmnd *);
+extern int sd_zbc_setup_close_cmnd(struct scsi_cmnd *);
+extern int sd_zbc_setup_finish_cmnd(struct scsi_cmnd *);
+extern void sd_zbc_done(struct scsi_cmnd *, struct scsi_sense_hdr *);
+
+#else /* CONFIG_BLK_DEV_ZONED */
+
+static inline void sd_zbc_read_zones(struct scsi_disk *sdkp,
+				     unsigned char *buf) {}
+static inline void sd_zbc_remove(struct scsi_disk *sdkp) {}
+
+static inline int sd_zbc_setup_read_write(struct scsi_disk *sdkp,
+					  struct request *rq, sector_t sector,
+					  unsigned int *num_sectors)
+{
+	/* Let the drive fail requests */
+	return BLKPREP_OK;
+}
+
+static inline int sd_zbc_setup_report_cmnd(struct scsi_cmnd *cmd)
+{
+	return BLKPREP_KILL;
+}
+static inline int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *cmd)
+{
+	return BLKPREP_KILL;
+}
+static inline int sd_zbc_setup_open_cmnd(struct scsi_cmnd *cmd)
+{
+	return BLKPREP_KILL;
+}
+static inline int sd_zbc_setup_close_cmnd(struct scsi_cmnd *cmd)
+{
+	return BLKPREP_KILL;
+}
+static inline int sd_zbc_setup_finish_cmnd(struct scsi_cmnd *cmd)
+{
+	return BLKPREP_KILL;
+}
+
+static inline void sd_zbc_done(struct scsi_cmnd *cmd,
+			       struct scsi_sense_hdr *sshdr) {}
+
+#endif /* CONFIG_BLK_DEV_ZONED */
+
 #endif /* _SCSI_DISK_H */
diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
new file mode 100644
index 0000000..ec9c3fc
--- /dev/null
+++ b/drivers/scsi/sd_zbc.c
@@ -0,0 +1,1097 @@
+/*
+ * SCSI Zoned Block commands
+ *
+ * Copyright (C) 2014-2015 SUSE Linux GmbH
+ * Written by: Hannes Reinecke <hare@suse.de>
+ * Modified by: Damien Le Moal <damien.lemoal@hgst.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License version
+ * 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; see the file COPYING.  If not, write to
+ * the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139,
+ * USA.
+ *
+ */
+
+#include <linux/blkdev.h>
+#include <linux/rbtree.h>
+
+#include <asm/unaligned.h>
+
+#include <scsi/scsi.h>
+#include <scsi/scsi_cmnd.h>
+#include <scsi/scsi_dbg.h>
+#include <scsi/scsi_device.h>
+#include <scsi/scsi_driver.h>
+#include <scsi/scsi_host.h>
+#include <scsi/scsi_eh.h>
+
+#include "sd.h"
+#include "scsi_priv.h"
+
+enum zbc_zone_type {
+	ZBC_ZONE_TYPE_CONV = 0x1,
+	ZBC_ZONE_TYPE_SEQWRITE_REQ,
+	ZBC_ZONE_TYPE_SEQWRITE_PREF,
+	ZBC_ZONE_TYPE_RESERVED,
+};
+
+enum zbc_zone_cond {
+	ZBC_ZONE_COND_NO_WP,
+	ZBC_ZONE_COND_EMPTY,
+	ZBC_ZONE_COND_IMP_OPEN,
+	ZBC_ZONE_COND_EXP_OPEN,
+	ZBC_ZONE_COND_CLOSED,
+	ZBC_ZONE_COND_READONLY = 0xd,
+	ZBC_ZONE_COND_FULL,
+	ZBC_ZONE_COND_OFFLINE,
+};
+
+#define SD_ZBC_BUF_SIZE 131072
+
+#define sd_zbc_debug(sdkp, fmt, args...)			\
+	pr_debug("%s %s [%s]: " fmt,				\
+		 dev_driver_string(&(sdkp)->device->sdev_gendev), \
+		 dev_name(&(sdkp)->device->sdev_gendev),	 \
+		 (sdkp)->disk->disk_name, ## args)
+
+#define sd_zbc_debug_ratelimit(sdkp, fmt, args...)		\
+	do {							\
+		if (printk_ratelimit())				\
+			sd_zbc_debug(sdkp, fmt, ## args);	\
+	} while (0)
+
+#define sd_zbc_err(sdkp, fmt, args...)				\
+	pr_err("%s %s [%s]: " fmt,				\
+	       dev_driver_string(&(sdkp)->device->sdev_gendev),	\
+	       dev_name(&(sdkp)->device->sdev_gendev),		\
+	       (sdkp)->disk->disk_name, ## args)
+
+struct zbc_zone_work {
+	struct work_struct 	zone_work;
+	struct scsi_disk 	*sdkp;
+	sector_t		sector;
+	sector_t		nr_sects;
+	bool 			init;
+	unsigned int		nr_zones;
+};
+
+struct blk_zone *zbc_desc_to_zone(struct scsi_disk *sdkp, unsigned char *rec)
+{
+	struct blk_zone *zone;
+
+	zone = kzalloc(sizeof(struct blk_zone), GFP_KERNEL);
+	if (!zone)
+		return NULL;
+
+	/* Zone type */
+	switch (rec[0] & 0x0f) {
+	case ZBC_ZONE_TYPE_CONV:
+	case ZBC_ZONE_TYPE_SEQWRITE_REQ:
+	case ZBC_ZONE_TYPE_SEQWRITE_PREF:
+		zone->type = rec[0] & 0x0f;
+		break;
+	default:
+		zone->type = BLK_ZONE_TYPE_UNKNOWN;
+		break;
+	}
+
+	/* Zone condition */
+	zone->cond = (rec[1] >> 4) & 0xf;
+	if (rec[1] & 0x01)
+		zone->reset = 1;
+	if (rec[1] & 0x02)
+		zone->non_seq = 1;
+
+	/* Zone start sector and length */
+	zone->len = logical_to_sectors(sdkp->device,
+				       get_unaligned_be64(&rec[8]));
+	zone->start = logical_to_sectors(sdkp->device,
+					 get_unaligned_be64(&rec[16]));
+
+	/* Zone write pointer */
+	if (blk_zone_is_empty(zone) &&
+	    zone->wp != zone->start)
+		zone->wp = zone->start;
+	else if (blk_zone_is_full(zone))
+		zone->wp = zone->start + zone->len;
+	else if (blk_zone_is_seq(zone))
+		zone->wp = logical_to_sectors(sdkp->device,
+					      get_unaligned_be64(&rec[24]));
+	else
+		zone->wp = (sector_t)-1;
+
+	return zone;
+}
+
+static int zbc_parse_zones(struct scsi_disk *sdkp, unsigned char *buf,
+			   unsigned int buf_len, sector_t *next_sector)
+{
+	struct request_queue *q = sdkp->disk->queue;
+	sector_t capacity = logical_to_sectors(sdkp->device, sdkp->capacity);
+	unsigned char *rec = buf;
+	unsigned int zone_len, list_length;
+
+	/* Parse REPORT ZONES header */
+	list_length = get_unaligned_be32(&buf[0]);
+	rec = buf + 64;
+	list_length += 64;
+
+	if (list_length < buf_len)
+		buf_len = list_length;
+
+	/* Parse REPORT ZONES zone descriptors */
+	*next_sector = capacity;
+	while (rec < buf + buf_len) {
+
+		struct blk_zone *new, *old;
+
+		new = zbc_desc_to_zone(sdkp, rec);
+		if (!new)
+			return -ENOMEM;
+
+		zone_len = new->len;
+		*next_sector = new->start + zone_len;
+
+		old = blk_insert_zone(q, new);
+		if (old) {
+			blk_lock_zone(old);
+
+			/*
+			 * Always update the zone state flags and the zone
+			 * offline and read-only condition as the drive may
+			 * change those independently of the commands being
+			 * executed
+			 */
+			old->reset = new->reset;
+			old->non_seq = new->non_seq;
+			if (blk_zone_is_offline(new) ||
+			    blk_zone_is_readonly(new))
+				old->cond = new->cond;
+
+			if (blk_zone_in_update(old)) {
+				old->cond = new->cond;
+				old->wp = new->wp;
+				blk_clear_zone_update(old);
+			}
+
+			blk_unlock_zone(old);
+
+			kfree(new);
+		}
+
+		rec += 64;
+
+	}
+
+	return 0;
+}
+
+/**
+ * sd_zbc_report_zones - Issue a REPORT ZONES scsi command
+ * @sdkp: SCSI disk to which the command should be sent
+ * @buffer: response buffer
+ * @bufflen: length of @buffer
+ * @start_sector: first sector for which zone information should be reported
+ * @option: reporting option to be used
+ * @partial: flag to set the 'partial' bit for report zones command
+ */
+int sd_zbc_report_zones(struct scsi_disk *sdkp, unsigned char *buffer,
+			int bufflen, sector_t start_sector,
+			enum zbc_zone_reporting_options option, bool partial)
+{
+	struct scsi_device *sdp = sdkp->device;
+	const int timeout = sdp->request_queue->rq_timeout;
+	struct scsi_sense_hdr sshdr;
+	sector_t start_lba = sectors_to_logical(sdkp->device, start_sector);
+	unsigned char cmd[16];
+	int result;
+
+	if (!scsi_device_online(sdp))
+		return -ENODEV;
+
+	sd_zbc_debug(sdkp, "REPORT ZONES lba %zu len %d\n",
+		     start_lba, bufflen);
+
+	memset(cmd, 0, 16);
+	cmd[0] = ZBC_IN;
+	cmd[1] = ZI_REPORT_ZONES;
+	put_unaligned_be64(start_lba, &cmd[2]);
+	put_unaligned_be32(bufflen, &cmd[10]);
+	cmd[14] = (partial ? ZBC_REPORT_ZONE_PARTIAL : 0) | option;
+	memset(buffer, 0, bufflen);
+
+	result = scsi_execute_req(sdp, cmd, DMA_FROM_DEVICE,
+				buffer, bufflen, &sshdr,
+				timeout, SD_MAX_RETRIES, NULL);
+
+	if (result) {
+		sd_zbc_err(sdkp,
+			   "REPORT ZONES lba %zu failed with %d/%d\n",
+			   start_lba, host_byte(result), driver_byte(result));
+		return -EIO;
+	}
+
+	return 0;
+}
+
+/**
+ * Set or clear the update flag of all zones contained
+ * in the range sector..sector+nr_sects.
+ * Return the number of zones marked/cleared.
+ */
+static int __sd_zbc_zones_updating(struct scsi_disk *sdkp,
+				   sector_t sector, sector_t nr_sects,
+				   bool set)
+{
+	struct request_queue *q = sdkp->disk->queue;
+	struct blk_zone *zone;
+	struct rb_node *node;
+	unsigned long flags;
+	int nr_zones = 0;
+
+	if (!nr_sects) {
+		/* All zones */
+		sector = 0;
+		nr_sects = logical_to_sectors(sdkp->device, sdkp->capacity);
+	}
+
+	spin_lock_irqsave(&q->zones_lock, flags);
+	for (node = rb_first(&q->zones); node && nr_sects; node = rb_next(node)) {
+		zone = rb_entry(node, struct blk_zone, node);
+		if (sector < zone->start || sector >= (zone->start + zone->len))
+			continue;
+		if (set) {
+			if (!test_and_set_bit_lock(BLK_ZONE_IN_UPDATE, &zone->flags))
+				nr_zones++;
+		} else if (test_and_clear_bit(BLK_ZONE_IN_UPDATE, &zone->flags)) {
+			wake_up_bit(&zone->flags, BLK_ZONE_IN_UPDATE);
+			nr_zones++;
+		}
+		sector = zone->start + zone->len;
+		if (nr_sects <= zone->len)
+			nr_sects = 0;
+		else
+			nr_sects -= zone->len;
+	}
+	spin_unlock_irqrestore(&q->zones_lock, flags);
+
+	return nr_zones;
+}
+
+static inline int sd_zbc_set_zones_updating(struct scsi_disk *sdkp,
+					    sector_t sector, sector_t nr_sects)
+{
+	return __sd_zbc_zones_updating(sdkp, sector, nr_sects, true);
+}
+
+static inline int sd_zbc_clear_zones_updating(struct scsi_disk *sdkp,
+					      sector_t sector, sector_t nr_sects)
+{
+	return __sd_zbc_zones_updating(sdkp, sector, nr_sects, false);
+}
+
+static void sd_zbc_start_queue(struct request_queue *q)
+{
+	unsigned long flags;
+
+	if (q->mq_ops) {
+		blk_mq_start_hw_queues(q);
+	} else {
+		spin_lock_irqsave(q->queue_lock, flags);
+		blk_start_queue(q);
+		spin_unlock_irqrestore(q->queue_lock, flags);
+	}
+}
+
+static void sd_zbc_update_zone_work(struct work_struct *work)
+{
+	struct zbc_zone_work *zwork =
+		container_of(work, struct zbc_zone_work, zone_work);
+	struct scsi_disk *sdkp = zwork->sdkp;
+	sector_t capacity = logical_to_sectors(sdkp->device, sdkp->capacity);
+	struct request_queue *q = sdkp->disk->queue;
+	sector_t end_sector, sector = zwork->sector;
+	unsigned int bufsize;
+	unsigned char *buf;
+	int ret = -ENOMEM;
+
+	/* Get a buffer */
+	if (!zwork->nr_zones) {
+		bufsize = SD_ZBC_BUF_SIZE;
+	} else {
+		bufsize = (zwork->nr_zones + 1) * 64;
+		if (bufsize < 512)
+			bufsize = 512;
+		else if (bufsize > SD_ZBC_BUF_SIZE)
+			bufsize = SD_ZBC_BUF_SIZE;
+		else
+			bufsize = (bufsize + 511) & ~511;
+	}
+	buf = kmalloc(bufsize, GFP_KERNEL | GFP_DMA);
+	if (!buf) {
+		sd_zbc_err(sdkp, "Failed to allocate zone report buffer\n");
+		goto done_free;
+	}
+
+	/* Process sector range */
+	end_sector = zwork->sector + zwork->nr_sects;
+	while (sector < min(end_sector, capacity)) {
+
+		/* Get zone report */
+		ret = sd_zbc_report_zones(sdkp, buf, bufsize, sector,
+					  ZBC_ZONE_REPORTING_OPTION_ALL, true);
+		if (ret)
+			break;
+
+		ret = zbc_parse_zones(sdkp, buf, bufsize, &sector);
+		if (ret)
+			break;
+
+		/* Kick start the queue to allow requests waiting */
+		/* for the zones just updated to run              */
+		sd_zbc_start_queue(q);
+
+	}
+
+done_free:
+	if (ret)
+		sd_zbc_clear_zones_updating(sdkp, zwork->sector, zwork->nr_sects);
+	if (buf)
+		kfree(buf);
+	kfree(zwork);
+}
+
+/**
+ * sd_zbc_update_zones - Update zone information for a sector range
+ * @sdkp: SCSI disk for which the zone information needs to be updated
+ * @sector: First sector of the first zone to be updated
+ * @nr_sects: Number of sectors to update (0 for all zones)
+ * @gfpflags: Allocation flags for the zone work item
+ * @init: If false, update only zones marked with the update flag
+ */
+static int sd_zbc_update_zones(struct scsi_disk *sdkp,
+			       sector_t sector, sector_t nr_sects,
+			       gfp_t gfpflags, bool init)
+{
+	struct zbc_zone_work *zwork;
+
+	zwork = kzalloc(sizeof(struct zbc_zone_work), gfpflags);
+	if (!zwork) {
+		sd_zbc_err(sdkp, "Failed to allocate zone work\n");
+		return -ENOMEM;
+	}
+
+	if (!nr_sects) {
+		/* All zones */
+		sector = 0;
+		nr_sects = logical_to_sectors(sdkp->device, sdkp->capacity);
+	}
+
+	INIT_WORK(&zwork->zone_work, sd_zbc_update_zone_work);
+	zwork->sdkp = sdkp;
+	zwork->sector = sector;
+	zwork->nr_sects = nr_sects;
+	zwork->init = init;
+
+	if (!init)
+		/* Mark the zones falling in the report as updating */
+		zwork->nr_zones = sd_zbc_set_zones_updating(sdkp, sector, nr_sects);
+
+	if (init || zwork->nr_zones)
+		queue_work(sdkp->zone_work_q, &zwork->zone_work);
+	else
+		kfree(zwork);
+
+	return 0;
+}
+
+int sd_zbc_setup_report_cmnd(struct scsi_cmnd *cmd)
+{
+	struct request *rq = cmd->request;
+	struct gendisk *disk = rq->rq_disk;
+	struct scsi_disk *sdkp = scsi_disk(disk);
+	int ret;
+
+	if (!sdkp->zone_work_q)
+		return BLKPREP_KILL;
+
+	ret = sd_zbc_update_zones(sdkp, blk_rq_pos(rq), blk_rq_sectors(rq),
+				  GFP_ATOMIC, false);
+	if (unlikely(ret))
+		return BLKPREP_DEFER;
+
+	return BLKPREP_DONE;
+}
+
+static void sd_zbc_setup_action_cmnd(struct scsi_cmnd *cmd,
+				     u8 action,
+				     bool all)
+{
+	struct request *rq = cmd->request;
+	struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
+	sector_t lba;
+
+	cmd->cmd_len = 16;
+	cmd->cmnd[0] = ZBC_OUT;
+	cmd->cmnd[1] = action;
+	if (all) {
+		cmd->cmnd[14] |= 0x01;
+	} else {
+		lba = sectors_to_logical(sdkp->device, blk_rq_pos(rq));
+		put_unaligned_be64(lba, &cmd->cmnd[2]);
+	}
+
+	rq->completion_data = NULL;
+	rq->timeout = SD_TIMEOUT;
+	rq->__data_len = blk_rq_bytes(rq);
+
+	/* Don't retry */
+	cmd->allowed = 0;
+	cmd->transfersize = 0;
+	cmd->sc_data_direction = DMA_NONE;
+}
+
+int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *cmd)
+{
+	struct request *rq = cmd->request;
+	struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
+	sector_t sector = blk_rq_pos(rq);
+	sector_t nr_sects = blk_rq_sectors(rq);
+	struct blk_zone *zone = NULL;
+	int ret = BLKPREP_OK;
+
+	if (nr_sects) {
+		zone = blk_lookup_zone(rq->q, sector);
+		if (!zone)
+			return BLKPREP_KILL;
+	}
+
+	if (zone) {
+
+		blk_lock_zone(zone);
+
+		/* If the zone is being updated, wait */
+		if (blk_zone_in_update(zone)) {
+			ret = BLKPREP_DEFER;
+			goto out;
+		}
+
+		if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
+			sd_zbc_debug(sdkp,
+				     "Discarding unknown zone %zu\n",
+				     zone->start);
+			ret = BLKPREP_KILL;
+			goto out;
+		}
+
+		/* Nothing to do for conventional zones */
+		if (blk_zone_is_conv(zone)) {
+			ret = BLKPREP_DONE;
+			goto out;
+		}
+
+		if (!blk_try_write_lock_zone(zone)) {
+			ret = BLKPREP_DEFER;
+			goto out;
+		}
+
+		/* Nothing to do if the zone is already empty */
+		if (blk_zone_is_empty(zone)) {
+			blk_write_unlock_zone(zone);
+			ret = BLKPREP_DONE;
+			goto out;
+		}
+
+		if (sector != zone->start ||
+		    (nr_sects != zone->len)) {
+			sd_printk(KERN_ERR, sdkp,
+				  "Unaligned reset wp request, start %zu/%zu"
+				  " len %zu/%zu\n",
+				  zone->start, sector, zone->len, nr_sects);
+			blk_write_unlock_zone(zone);
+			ret = BLKPREP_KILL;
+			goto out;
+		}
+
+	}
+
+	sd_zbc_setup_action_cmnd(cmd, ZO_RESET_WRITE_POINTER, !zone);
+
+out:
+	if (zone) {
+		if (ret == BLKPREP_OK) {
+			/*
+			 * Opportunistic update. Will be fixed up
+			 * with zone update if the command fails.
+			 */
+			zone->wp = zone->start;
+			zone->cond = BLK_ZONE_COND_EMPTY;
+			zone->reset = 0;
+			zone->non_seq = 0;
+		}
+		blk_unlock_zone(zone);
+	}
+
+	return ret;
+}
+
+int sd_zbc_setup_open_cmnd(struct scsi_cmnd *cmd)
+{
+	struct request *rq = cmd->request;
+	struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
+	sector_t sector = blk_rq_pos(rq);
+	sector_t nr_sects = blk_rq_sectors(rq);
+	struct blk_zone *zone = NULL;
+	int ret = BLKPREP_OK;
+
+	if (nr_sects) {
+		zone = blk_lookup_zone(rq->q, sector);
+		if (!zone)
+			return BLKPREP_KILL;
+	}
+
+	if (zone) {
+
+		blk_lock_zone(zone);
+
+		/* If the zone is being updated, wait */
+		if (blk_zone_in_update(zone)) {
+			ret = BLKPREP_DEFER;
+			goto out;
+		}
+
+		if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
+			sd_zbc_debug(sdkp,
+				     "Opening unknown zone %zu\n",
+				     zone->start);
+			ret = BLKPREP_KILL;
+			goto out;
+		}
+
+		/*
+		 * Nothing to do for conventional zones,
+		 * zones already open or full zones.
+		 */
+		if (blk_zone_is_conv(zone) ||
+		    blk_zone_is_open(zone) ||
+		    blk_zone_is_full(zone)) {
+			ret = BLKPREP_DONE;
+			goto out;
+		}
+
+		if (sector != zone->start ||
+		    (nr_sects != zone->len)) {
+			sd_printk(KERN_ERR, sdkp,
+				  "Unaligned open zone request, start %zu/%zu"
+				  " len %zu/%zu\n",
+				  zone->start, sector, zone->len, nr_sects);
+			ret = BLKPREP_KILL;
+			goto out;
+		}
+
+	}
+
+	sd_zbc_setup_action_cmnd(cmd, ZO_OPEN_ZONE, !zone);
+
+out:
+	if (zone) {
+		if (ret == BLKPREP_OK)
+			/*
+			 * Opportunistic update. Will be fixed up
+			 * with zone update if the command fails.
+			 */
+			zone->cond = BLK_ZONE_COND_EXP_OPEN;
+		blk_unlock_zone(zone);
+	}
+
+	return ret;
+}
+
+int sd_zbc_setup_close_cmnd(struct scsi_cmnd *cmd)
+{
+	struct request *rq = cmd->request;
+	struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
+	sector_t sector = blk_rq_pos(rq);
+	sector_t nr_sects = blk_rq_sectors(rq);
+	struct blk_zone *zone = NULL;
+	int ret = BLKPREP_OK;
+
+	if (nr_sects) {
+		zone = blk_lookup_zone(rq->q, sector);
+		if (!zone)
+			return BLKPREP_KILL;
+	}
+
+	if (zone) {
+
+		blk_lock_zone(zone);
+
+		/* If the zone is being updated, wait */
+		if (blk_zone_in_update(zone)) {
+			ret = BLKPREP_DEFER;
+			goto out;
+		}
+
+		if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
+			sd_zbc_debug(sdkp,
+				     "Closing unknown zone %zu\n",
+				     zone->start);
+			ret = BLKPREP_KILL;
+			goto out;
+		}
+
+		/*
+		 * Nothing to do for conventional zones,
+		 * full zones or empty zones.
+		 */
+		if (blk_zone_is_conv(zone) ||
+		    blk_zone_is_full(zone) ||
+		    blk_zone_is_empty(zone)) {
+			ret = BLKPREP_DONE;
+			goto out;
+		}
+
+		if (sector != zone->start ||
+		    (nr_sects != zone->len)) {
+			sd_printk(KERN_ERR, sdkp,
+				  "Unaligned close zone request, start %zu/%zu"
+				  " len %zu/%zu\n",
+				  zone->start, sector, zone->len, nr_sects);
+			ret = BLKPREP_KILL;
+			goto out;
+		}
+
+	}
+
+	sd_zbc_setup_action_cmnd(cmd, ZO_CLOSE_ZONE, !zone);
+
+out:
+	if (zone) {
+		if (ret == BLKPREP_OK)
+			/*
+			 * Opportunistic update. Will be fixed up
+			 * with zone update if the command fails.
+			 */
+			zone->cond = BLK_ZONE_COND_CLOSED;
+		blk_unlock_zone(zone);
+	}
+
+	return ret;
+}
+
+int sd_zbc_setup_finish_cmnd(struct scsi_cmnd *cmd)
+{
+	struct request *rq = cmd->request;
+	struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
+	sector_t sector = blk_rq_pos(rq);
+	sector_t nr_sects = blk_rq_sectors(rq);
+	struct blk_zone *zone = NULL;
+	int ret = BLKPREP_OK;
+
+	if (nr_sects) {
+		zone = blk_lookup_zone(rq->q, sector);
+		if (!zone)
+			return BLKPREP_KILL;
+	}
+
+	if (zone) {
+
+		blk_lock_zone(zone);
+
+		/* If the zone is being updated, wait */
+		if (blk_zone_in_update(zone)) {
+			ret = BLKPREP_DEFER;
+			goto out;
+		}
+
+		if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
+			sd_zbc_debug(sdkp,
+				     "Finishing unknown zone %zu\n",
+				     zone->start);
+			ret = BLKPREP_KILL;
+			goto out;
+		}
+
+		/* Nothing to do for conventional zones and full zones */
+		if (blk_zone_is_conv(zone) ||
+		    blk_zone_is_full(zone)) {
+			ret = BLKPREP_DONE;
+			goto out;
+		}
+
+		if (sector != zone->start ||
+		    (nr_sects != zone->len)) {
+			sd_printk(KERN_ERR, sdkp,
+				  "Unaligned finish zone request, start %zu/%zu"
+				  " len %zu/%zu\n",
+				  zone->start, sector, zone->len, nr_sects);
+			ret = BLKPREP_KILL;
+			goto out;
+		}
+
+	}
+
+	sd_zbc_setup_action_cmnd(cmd, ZO_FINISH_ZONE, !zone);
+
+out:
+	if (zone) {
+		if (ret == BLKPREP_OK) {
+			/*
+			 * Opportunistic update. Will be fixed up
+			 * with zone update if the command fails.
+			 */
+			zone->cond = BLK_ZONE_COND_FULL;
+			if (blk_zone_is_seq(zone))
+				zone->wp = zone->start + zone->len;
+		}
+		blk_unlock_zone(zone);
+	}
+
+	return ret;
+}
+
+int sd_zbc_setup_read_write(struct scsi_disk *sdkp, struct request *rq,
+			    sector_t sector, unsigned int *num_sectors)
+{
+	struct blk_zone *zone;
+	unsigned int sectors = *num_sectors;
+	int ret = BLKPREP_OK;
+
+	zone = blk_lookup_zone(rq->q, sector);
+	if (!zone)
+		/* Let the drive handle the request */
+		return BLKPREP_OK;
+
+	blk_lock_zone(zone);
+
+	/* If the zone is being updated, wait */
+	if (blk_zone_in_update(zone)) {
+		ret = BLKPREP_DEFER;
+		goto out;
+	}
+
+	if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
+		sd_zbc_debug(sdkp,
+			     "Unknown zone %zu\n",
+			     zone->start);
+		ret = BLKPREP_KILL;
+		goto out;
+	}
+
+	/* For offline and read-only zones, let the drive fail the command */
+	if (blk_zone_is_offline(zone) ||
+	    blk_zone_is_readonly(zone))
+		goto out;
+
+	/* Do not allow crossing zone boundaries */
+	if (sector + sectors > zone->start + zone->len) {
+		ret = BLKPREP_KILL;
+		goto out;
+	}
+
+	/* For conventional zones, no checks */
+	if (blk_zone_is_conv(zone))
+		goto out;
+
+	if (req_op(rq) == REQ_OP_WRITE ||
+	    req_op(rq) == REQ_OP_WRITE_SAME) {
+
+		/*
+		 * Write requests may change the write pointer and
+		 * transition the zone condition to full. Changes
+		 * are opportunistic here. If the request fails, a
+		 * zone update will fix the zone information.
+		 */
+		if (blk_zone_is_seq_req(zone)) {
+
+			/*
+			 * Do not issue more than one write at a time per
+			 * zone. This solves write ordering problems due to
+			 * the unlocking of the request queue in the dispatch
+			 * path in the non scsi-mq case. For scsi-mq, this
+			 * also avoids potential write reordering when multiple
+			 * threads running on different CPUs write to the same
+			 * zone (with a synchronized sequential pattern).
+			 */
+			if (!blk_try_write_lock_zone(zone)) {
+				ret = BLKPREP_DEFER;
+				goto out;
+			}
+
+			/* For host-managed drives, writes are allowed */
+			/* only at the write pointer position.         */
+			if (zone->wp != sector) {
+				blk_write_unlock_zone(zone);
+				ret = BLKPREP_KILL;
+				goto out;
+			}
+
+			zone->wp += sectors;
+			if (zone->wp >= zone->start + zone->len) {
+				zone->cond = BLK_ZONE_COND_FULL;
+				zone->wp = zone->start + zone->len;
+			}
+
+		} else {
+
+			/* For host-aware drives, writes are allowed */
+			/* anywhere in the zone, but wp can only go  */
+			/* forward.                                  */
+			sector_t end_sector = sector + sectors;
+			if (sector == zone->wp &&
+			    end_sector >= zone->start + zone->len) {
+				zone->cond = BLK_ZONE_COND_FULL;
+				zone->wp = zone->start + zone->len;
+			} else if (end_sector > zone->wp) {
+				zone->wp = end_sector;
+			}
+
+		}
+
+	} else {
+
+		/* Check read after write pointer */
+		if (sector + sectors <= zone->wp)
+			goto out;
+
+		if (zone->wp <= sector) {
+			/* Read beyond WP: clear request buffer */
+			struct req_iterator iter;
+			struct bio_vec bvec;
+			unsigned long flags;
+			void *buf;
+			rq_for_each_segment(bvec, rq, iter) {
+				buf = bvec_kmap_irq(&bvec, &flags);
+				memset(buf, 0, bvec.bv_len);
+				flush_dcache_page(bvec.bv_page);
+				bvec_kunmap_irq(buf, &flags);
+			}
+			ret = BLKPREP_DONE;
+			goto out;
+		}
+
+		/* Read straddles WP position: limit request size */
+		*num_sectors = zone->wp - sector;
+
+	}
+
+out:
+	blk_unlock_zone(zone);
+
+	return ret;
+}
+
+void sd_zbc_done(struct scsi_cmnd *cmd,
+		 struct scsi_sense_hdr *sshdr)
+{
+	int result = cmd->result;
+	struct request *rq = cmd->request;
+	struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
+	struct request_queue *q = sdkp->disk->queue;
+	sector_t pos = blk_rq_pos(rq);
+	struct blk_zone *zone = NULL;
+	bool write_unlock = false;
+
+	/*
+	 * Get the target zone of commands of interest. Some may
+	 * apply to all zones so check the request sectors first.
+	 */
+	switch (req_op(rq)) {
+	case REQ_OP_DISCARD:
+	case REQ_OP_WRITE:
+	case REQ_OP_WRITE_SAME:
+	case REQ_OP_ZONE_RESET:
+		write_unlock = true;
+		/* fallthru */
+	case REQ_OP_ZONE_OPEN:
+	case REQ_OP_ZONE_CLOSE:
+	case REQ_OP_ZONE_FINISH:
+		if (blk_rq_sectors(rq))
+			zone = blk_lookup_zone(q, pos);
+		break;
+	}
+
+	if (zone && write_unlock)
+		blk_write_unlock_zone(zone);
+
+	if (!result)
+		return;
+
+	if (sshdr->sense_key == ILLEGAL_REQUEST &&
+	    sshdr->asc == 0x21)
+		/*
+		 * It is unlikely that retrying requests that failed with an
+		 * alignment error will result in success. So don't
+		 * try. Report the error back to the user quickly so that
+		 * corrective actions can be taken after obtaining updated
+		 * zone information.
+		 */
+		cmd->allowed = 0;
+
+	/* On error, force an update unless this is a failed report */
+	if (req_op(rq) == REQ_OP_ZONE_REPORT)
+		sd_zbc_clear_zones_updating(sdkp, pos, blk_rq_sectors(rq));
+	else if (zone)
+		sd_zbc_update_zones(sdkp, zone->start, zone->len,
+				    GFP_ATOMIC, false);
+}
+
+void sd_zbc_read_zones(struct scsi_disk *sdkp, char *buf)
+{
+	struct request_queue *q = sdkp->disk->queue;
+	struct blk_zone *zone;
+	sector_t capacity;
+	sector_t sector;
+	bool init = false;
+	u32 rep_len;
+	int ret = 0;
+
+	if (sdkp->zoned != 1 && sdkp->device->type != TYPE_ZBC)
+		/*
+		 * Device managed or normal SCSI disk,
+		 * no special handling required
+		 */
+		return;
+
+	/* Do a report zone to get the maximum LBA to check capacity */
+	ret = sd_zbc_report_zones(sdkp, buf, SD_BUF_SIZE,
+				  0, ZBC_ZONE_REPORTING_OPTION_ALL, false);
+	if (ret < 0)
+		return;
+
+	rep_len = get_unaligned_be32(&buf[0]);
+	if (rep_len < 64) {
+		sd_printk(KERN_WARNING, sdkp,
+			  "REPORT ZONES report invalid length %u\n",
+			  rep_len);
+		return;
+	}
+
+	if (sdkp->rc_basis == 0) {
+		/* The max_lba field is the capacity of this device */
+		sector_t lba = get_unaligned_be64(&buf[8]);
+		if (lba + 1 > sdkp->capacity) {
+			if (sdkp->first_scan)
+				sd_printk(KERN_WARNING, sdkp,
+					  "Changing capacity from %llu "
+					  "to max LBA+1 %llu\n",
+					  (unsigned long long)sdkp->capacity,
+					  (unsigned long long)lba + 1);
+			sdkp->capacity = lba + 1;
+		}
+	}
+
+	/* Setup the zone work queue */
+	if (!sdkp->zone_work_q) {
+		sdkp->zone_work_q =
+			alloc_ordered_workqueue("zbc_wq_%s", WQ_MEM_RECLAIM,
+						sdkp->disk->disk_name);
+		if (!sdkp->zone_work_q) {
+			sdev_printk(KERN_WARNING, sdkp->device,
+				    "Create zoned disk workqueue failed\n");
+			return;
+		}
+		init = true;
+	}
+
+	/*
+	 * Parse what we already got. If not all zones are parsed yet,
+	 * kick off an update to get the remaining ones.
+	 */
+	capacity = logical_to_sectors(sdkp->device, sdkp->capacity);
+	ret = zbc_parse_zones(sdkp, buf, SD_BUF_SIZE, &sector);
+	if (ret == 0 && sector < capacity) {
+		sd_zbc_update_zones(sdkp, sector, capacity - sector,
+				    GFP_KERNEL, init);
+		drain_workqueue(sdkp->zone_work_q);
+	}
+	if (ret)
+		return;
+
+	/*
+	 * Analyze the zones layout: if all zones are the same size and
+	 * the size is a power of 2, chunk the device and map discard to
+	 * reset write pointer command. Otherwise, disable discard.
+	 */
+	sdkp->zone_sectors = 0;
+	sdkp->nr_zones = 0;
+	sector = 0;
+	while (sector < capacity) {
+
+		zone = blk_lookup_zone(q, sector);
+		if (!zone) {
+			sdkp->zone_sectors = 0;
+			sdkp->nr_zones = 0;
+			break;
+		}
+
+		sector += zone->len;
+
+		if (sdkp->zone_sectors == 0) {
+			sdkp->zone_sectors = zone->len;
+		} else if (sector != capacity &&
+			 zone->len != sdkp->zone_sectors) {
+			sdkp->zone_sectors = 0;
+			sdkp->nr_zones = 0;
+			break;
+		}
+
+		sdkp->nr_zones++;
+
+	}
+
+	if (!sdkp->zone_sectors ||
+	    !is_power_of_2(sdkp->zone_sectors)) {
+		sd_config_discard(sdkp, SD_LBP_DISABLE);
+		if (sdkp->first_scan)
+			sd_printk(KERN_NOTICE, sdkp,
+				  "%u zones (non-constant zone size)\n",
+				  sdkp->nr_zones);
+		return;
+	}
+
+	/* Setup discard granularity to the zone size */
+	blk_queue_chunk_sectors(sdkp->disk->queue, sdkp->zone_sectors);
+	sdkp->max_unmap_blocks = sdkp->zone_sectors;
+	sdkp->unmap_alignment = sectors_to_logical(sdkp->device,
+						   sdkp->zone_sectors);
+	sdkp->unmap_granularity = sdkp->unmap_alignment;
+	sd_config_discard(sdkp, SD_ZBC_RESET_WP);
+
+	if (sdkp->first_scan) {
+		if (sdkp->nr_zones * sdkp->zone_sectors == capacity)
+			sd_printk(KERN_NOTICE, sdkp,
+				  "%u zones of %llu sectors\n",
+				  sdkp->nr_zones,
+				  (unsigned long long)sdkp->zone_sectors);
+		else
+			sd_printk(KERN_NOTICE, sdkp,
+				  "%u zones of %llu sectors "
+				  "+ 1 runt zone\n",
+				  sdkp->nr_zones - 1,
+				  (unsigned long long)sdkp->zone_sectors);
+	}
+}
+
+void sd_zbc_remove(struct scsi_disk *sdkp)
+{
+
+	sd_config_discard(sdkp, SD_LBP_DISABLE);
+
+	if (sdkp->zone_work_q) {
+		drain_workqueue(sdkp->zone_work_q);
+		destroy_workqueue(sdkp->zone_work_q);
+		sdkp->zone_work_q = NULL;
+		blk_drop_zones(sdkp->disk->queue);
+	}
+}
+
diff --git a/include/scsi/scsi_proto.h b/include/scsi/scsi_proto.h
index d1defd1..6ba66e0 100644
--- a/include/scsi/scsi_proto.h
+++ b/include/scsi/scsi_proto.h
@@ -299,4 +299,21 @@ struct scsi_lun {
 #define SCSI_ACCESS_STATE_MASK        0x0f
 #define SCSI_ACCESS_STATE_PREFERRED   0x80
 
+/* Reporting options for REPORT ZONES */
+enum zbc_zone_reporting_options {
+	ZBC_ZONE_REPORTING_OPTION_ALL = 0,
+	ZBC_ZONE_REPORTING_OPTION_EMPTY,
+	ZBC_ZONE_REPORTING_OPTION_IMPLICIT_OPEN,
+	ZBC_ZONE_REPORTING_OPTION_EXPLICIT_OPEN,
+	ZBC_ZONE_REPORTING_OPTION_CLOSED,
+	ZBC_ZONE_REPORTING_OPTION_FULL,
+	ZBC_ZONE_REPORTING_OPTION_READONLY,
+	ZBC_ZONE_REPORTING_OPTION_OFFLINE,
+	ZBC_ZONE_REPORTING_OPTION_NEED_RESET_WP = 0x10,
+	ZBC_ZONE_REPORTING_OPTION_NON_SEQWRITE,
+	ZBC_ZONE_REPORTING_OPTION_NON_WP = 0x3f,
+};
+
+#define ZBC_REPORT_ZONE_PARTIAL 0x80
+
 #endif /* _SCSI_PROTO_H_ */
-- 
2.7.4


* [PATCH 8/9] sd: Implement support for ZBC devices
@ 2016-09-19 21:27   ` Damien Le Moal
  0 siblings, 0 replies; 36+ messages in thread
From: Damien Le Moal @ 2016-09-19 21:27 UTC (permalink / raw)
  To: linux-scsi, linux-block
  Cc: martin.petersen, axboe, hare, shaun.tancheff, Hannes Reinecke,
	Damien Le Moal

From: Hannes Reinecke <hare@suse.com>

Implement ZBC support functions to set up zoned disks and fill the
block device zone information tree during the device scan. The
zone information tree is also always updated on disk revalidation.
This adds support for the REQ_OP_ZONE* operations and also implements
the new RESET_WP provisioning mode so that discard requests can be
mapped to the RESET WRITE POINTER command for devices with a constant,
power-of-2 zone size.
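
As a rough illustration of the zone layout check that decides whether
discard can be mapped to RESET WRITE POINTER, here is a standalone
userspace sketch. It is not code from this patch: is_pow2() and
reset_wp_zone_size() are hypothetical names, and the zone lengths are
passed in as a plain array instead of being looked up in the zone tree.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if x is a non-zero power of two. */
static bool is_pow2(uint64_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/*
 * Hypothetical stand-in for the zone layout check. zone_len[] holds the
 * length in 512-byte sectors of each zone, in LBA order. Returns the
 * common zone size if discard could be mapped to RESET WRITE POINTER,
 * or 0 if discard would have to be disabled. The last zone is exempt
 * from the size comparison, so a trailing runt zone is accepted.
 */
static uint64_t reset_wp_zone_size(const uint64_t *zone_len, size_t nr_zones)
{
	uint64_t zone_sectors = 0;
	size_t i;

	for (i = 0; i < nr_zones; i++) {
		if (zone_sectors == 0)
			zone_sectors = zone_len[i];
		else if (i + 1 != nr_zones && zone_len[i] != zone_sectors)
			return 0;
	}

	return is_pow2(zone_sectors) ? zone_sectors : 0;
}
```

With 524288-sector (256 MiB) zones, the layout {524288, 524288, 1000}
is accepted because only the last zone differs, while
{524288, 1000, 524288} is rejected.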

The capacity read of the device triggers the zone information read
for zoned block devices. As this needs the device zone model, the
call to sd_read_capacity is moved after the call to
sd_read_block_characteristics so that host-aware devices are
properly initialized. The call to sd_zbc_read_zones in
sd_read_capacity may change the device capacity obtained with
the read_capacity_16 function for devices reporting only the
capacity of conventional zones at the beginning of the LBA range
(i.e. devices with rc_basis set to 0).
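
To make the rc_basis case concrete, the following standalone userspace
sketch shows the capacity fixup in isolation. The function names are
hypothetical and this is not the patch code. The byte offsets are the
ones the patch reads: byte 12 of the READ CAPACITY(16) data for
rc_basis and bytes 8..15 of the REPORT ZONES header for the maximum LBA.

```c
#include <stdint.h>

/* Hypothetical helper: read a big-endian 64-bit value from a byte buffer. */
static uint64_t be64_at(const unsigned char *p)
{
	uint64_t v = 0;
	int i;

	for (i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/* RC BASIS is taken from bits 5:4 of byte 12 of the READ CAPACITY(16) data. */
static unsigned int rc_basis_of(const unsigned char *readcap16)
{
	return (readcap16[12] >> 4) & 0x3;
}

/*
 * Hypothetical stand-in for the capacity fixup: with rc_basis == 0 the
 * capacity (in logical blocks) is raised to max LBA + 1, where max LBA is
 * read from bytes 8..15 of the REPORT ZONES header. Otherwise it is kept.
 */
static uint64_t zbc_fixup_capacity(uint64_t capacity, unsigned int rc_basis,
				   const unsigned char *report_hdr)
{
	uint64_t max_lba = be64_at(&report_hdr[8]);

	if (rc_basis == 0 && max_lba + 1 > capacity)
		return max_lba + 1;
	return capacity;
}
```

The capacity is only ever raised, never lowered, which matches the
"lba + 1 > sdkp->capacity" test in sd_zbc_read_zones.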

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
---
 drivers/scsi/Makefile     |    1 +
 drivers/scsi/sd.c         |  147 ++++--
 drivers/scsi/sd.h         |   68 +++
 drivers/scsi/sd_zbc.c     | 1097 +++++++++++++++++++++++++++++++++++++++++++++
 include/scsi/scsi_proto.h |   17 +
 5 files changed, 1304 insertions(+), 26 deletions(-)
 create mode 100644 drivers/scsi/sd_zbc.c

diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile
index d539798..fabcb6d 100644
--- a/drivers/scsi/Makefile
+++ b/drivers/scsi/Makefile
@@ -179,6 +179,7 @@ hv_storvsc-y			:= storvsc_drv.o
 
 sd_mod-objs	:= sd.o
 sd_mod-$(CONFIG_BLK_DEV_INTEGRITY) += sd_dif.o
+sd_mod-$(CONFIG_BLK_DEV_ZONED) += sd_zbc.o
 
 sr_mod-objs	:= sr.o sr_ioctl.o sr_vendor.o
 ncr53c8xx-flags-$(CONFIG_SCSI_ZALON) \
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index d3e852a..46b8b78 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -92,6 +92,7 @@ MODULE_ALIAS_BLOCKDEV_MAJOR(SCSI_DISK15_MAJOR);
 MODULE_ALIAS_SCSI_DEVICE(TYPE_DISK);
 MODULE_ALIAS_SCSI_DEVICE(TYPE_MOD);
 MODULE_ALIAS_SCSI_DEVICE(TYPE_RBC);
+MODULE_ALIAS_SCSI_DEVICE(TYPE_ZBC);
 
 #if !defined(CONFIG_DEBUG_BLOCK_EXT_DEVT)
 #define SD_MINORS	16
@@ -99,7 +100,6 @@ MODULE_ALIAS_SCSI_DEVICE(TYPE_RBC);
 #define SD_MINORS	0
 #endif
 
-static void sd_config_discard(struct scsi_disk *, unsigned int);
 static void sd_config_write_same(struct scsi_disk *);
 static int  sd_revalidate_disk(struct gendisk *);
 static void sd_unlock_native_capacity(struct gendisk *disk);
@@ -162,7 +162,7 @@ cache_type_store(struct device *dev, struct device_attribute *attr,
 	static const char temp[] = "temporary ";
 	int len;
 
-	if (sdp->type != TYPE_DISK)
+	if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
 		/* no cache control on RBC devices; theoretically they
 		 * can do it, but there's probably so many exceptions
 		 * it's not worth the risk */
@@ -261,7 +261,7 @@ allow_restart_store(struct device *dev, struct device_attribute *attr,
 	if (!capable(CAP_SYS_ADMIN))
 		return -EACCES;
 
-	if (sdp->type != TYPE_DISK)
+	if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
 		return -EINVAL;
 
 	sdp->allow_restart = simple_strtoul(buf, NULL, 10);
@@ -369,6 +369,7 @@ static const char *lbp_mode[] = {
 	[SD_LBP_WS16]		= "writesame_16",
 	[SD_LBP_WS10]		= "writesame_10",
 	[SD_LBP_ZERO]		= "writesame_zero",
+	[SD_ZBC_RESET_WP]	= "reset_wp",
 	[SD_LBP_DISABLE]	= "disabled",
 };
 
@@ -391,6 +392,13 @@ provisioning_mode_store(struct device *dev, struct device_attribute *attr,
 	if (!capable(CAP_SYS_ADMIN))
 		return -EACCES;
 
+	if (sdkp->zoned == 1 || sdp->type == TYPE_ZBC) {
+		if (!strncmp(buf, lbp_mode[SD_ZBC_RESET_WP], 20)) {
+			sd_config_discard(sdkp, SD_ZBC_RESET_WP);
+			return count;
+		}
+		return -EINVAL;
+	}
 	if (sdp->type != TYPE_DISK)
 		return -EINVAL;
 
@@ -458,7 +466,7 @@ max_write_same_blocks_store(struct device *dev, struct device_attribute *attr,
 	if (!capable(CAP_SYS_ADMIN))
 		return -EACCES;
 
-	if (sdp->type != TYPE_DISK)
+	if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
 		return -EINVAL;
 
 	err = kstrtoul(buf, 10, &max);
@@ -631,7 +639,7 @@ static unsigned char sd_setup_protect_cmnd(struct scsi_cmnd *scmd,
 	return protect;
 }
 
-static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode)
+void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode)
 {
 	struct request_queue *q = sdkp->disk->queue;
 	unsigned int logical_block_size = sdkp->device->sector_size;
@@ -683,6 +691,11 @@ static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode)
 		q->limits.discard_zeroes_data = sdkp->lbprz;
 		break;
 
+	case SD_ZBC_RESET_WP:
+		max_blocks = min_not_zero(sdkp->max_unmap_blocks,
+					  (u32)SD_MAX_WS16_BLOCKS);
+		break;
+
 	case SD_LBP_ZERO:
 		max_blocks = min_not_zero(sdkp->max_ws_blocks,
 					  (u32)SD_MAX_WS10_BLOCKS);
@@ -711,16 +724,20 @@ static int sd_setup_discard_cmnd(struct scsi_cmnd *cmd)
 	unsigned int nr_sectors = blk_rq_sectors(rq);
 	unsigned int nr_bytes = blk_rq_bytes(rq);
 	unsigned int len;
-	int ret;
+	int ret = BLKPREP_OK;
 	char *buf;
-	struct page *page;
+	struct page *page = NULL;
 
 	sector >>= ilog2(sdp->sector_size) - 9;
 	nr_sectors >>= ilog2(sdp->sector_size) - 9;
 
-	page = alloc_page(GFP_ATOMIC | __GFP_ZERO);
-	if (!page)
-		return BLKPREP_DEFER;
+	if (sdkp->provisioning_mode != SD_ZBC_RESET_WP) {
+		page = alloc_page(GFP_ATOMIC | __GFP_ZERO);
+		if (!page)
+			return BLKPREP_DEFER;
+	}
+
+	rq->completion_data = page;
 
 	switch (sdkp->provisioning_mode) {
 	case SD_LBP_UNMAP:
@@ -760,12 +777,19 @@ static int sd_setup_discard_cmnd(struct scsi_cmnd *cmd)
 		len = sdkp->device->sector_size;
 		break;
 
+	case SD_ZBC_RESET_WP:
+		ret = sd_zbc_setup_reset_cmnd(cmd);
+		if (ret != BLKPREP_OK)
+			goto out;
+		/* Reset Write Pointer doesn't have a payload */
+		len = 0;
+		break;
+
 	default:
 		ret = BLKPREP_INVALID;
 		goto out;
 	}
 
-	rq->completion_data = page;
 	rq->timeout = SD_TIMEOUT;
 
 	cmd->transfersize = len;
@@ -779,13 +803,17 @@ static int sd_setup_discard_cmnd(struct scsi_cmnd *cmd)
 	 * discarded on disk. This allows us to report completion on the full
 	 * amount of blocks described by the request.
 	 */
-	blk_add_request_payload(rq, page, 0, len);
-	ret = scsi_init_io(cmd);
+	if (len) {
+		blk_add_request_payload(rq, page, 0, len);
+		ret = scsi_init_io(cmd);
+	}
 	rq->__data_len = nr_bytes;
 
 out:
-	if (ret != BLKPREP_OK)
+	if (page && ret != BLKPREP_OK) {
+		rq->completion_data = NULL;
 		__free_page(page);
+	}
 	return ret;
 }
 
@@ -843,6 +871,13 @@ static int sd_setup_write_same_cmnd(struct scsi_cmnd *cmd)
 
 	BUG_ON(bio_offset(bio) || bio_iovec(bio).bv_len != sdp->sector_size);
 
+	if (sdkp->zoned == 1 || sdp->type == TYPE_ZBC) {
+		/* sd_zbc_setup_read_write uses block layer sector units */
+		ret = sd_zbc_setup_read_write(sdkp, rq, sector, &nr_sectors);
+		if (ret != BLKPREP_OK)
+			return ret;
+	}
+
 	sector >>= ilog2(sdp->sector_size) - 9;
 	nr_sectors >>= ilog2(sdp->sector_size) - 9;
 
@@ -962,6 +997,13 @@ static int sd_setup_read_write_cmnd(struct scsi_cmnd *SCpnt)
 	SCSI_LOG_HLQUEUE(2, scmd_printk(KERN_INFO, SCpnt, "block=%llu\n",
 					(unsigned long long)block));
 
+	if (sdkp->zoned == 1 || sdp->type == TYPE_ZBC) {
+		/* sd_zbc_setup_read_write uses block layer sector units */
+		ret = sd_zbc_setup_read_write(sdkp, rq, block, &this_count);
+		if (ret != BLKPREP_OK)
+			goto out;
+	}
+
 	/*
 	 * If we have a 1K hardware sectorsize, prevent access to single
 	 * 512 byte sectors.  In theory we could handle this - in fact
@@ -1148,6 +1190,16 @@ static int sd_init_command(struct scsi_cmnd *cmd)
 	case REQ_OP_READ:
 	case REQ_OP_WRITE:
 		return sd_setup_read_write_cmnd(cmd);
+	case REQ_OP_ZONE_REPORT:
+		return sd_zbc_setup_report_cmnd(cmd);
+	case REQ_OP_ZONE_RESET:
+		return sd_zbc_setup_reset_cmnd(cmd);
+	case REQ_OP_ZONE_OPEN:
+		return sd_zbc_setup_open_cmnd(cmd);
+	case REQ_OP_ZONE_CLOSE:
+		return sd_zbc_setup_close_cmnd(cmd);
+	case REQ_OP_ZONE_FINISH:
+		return sd_zbc_setup_finish_cmnd(cmd);
 	default:
 		BUG();
 	}
@@ -1157,7 +1209,8 @@ static void sd_uninit_command(struct scsi_cmnd *SCpnt)
 {
 	struct request *rq = SCpnt->request;
 
-	if (req_op(rq) == REQ_OP_DISCARD)
+	if (req_op(rq) == REQ_OP_DISCARD &&
+	    rq->completion_data)
 		__free_page(rq->completion_data);
 
 	if (SCpnt->cmnd != rq->cmd) {
@@ -1778,8 +1831,16 @@ static int sd_done(struct scsi_cmnd *SCpnt)
 	int sense_deferred = 0;
 	unsigned char op = SCpnt->cmnd[0];
 	unsigned char unmap = SCpnt->cmnd[1] & 8;
+	unsigned char sa = SCpnt->cmnd[1] & 0xf;
 
-	if (req_op(req) == REQ_OP_DISCARD || req_op(req) == REQ_OP_WRITE_SAME) {
+	switch (req_op(req)) {
+	case REQ_OP_DISCARD:
+	case REQ_OP_WRITE_SAME:
+	case REQ_OP_ZONE_REPORT:
+	case REQ_OP_ZONE_RESET:
+	case REQ_OP_ZONE_OPEN:
+	case REQ_OP_ZONE_CLOSE:
+	case REQ_OP_ZONE_FINISH:
 		if (!result) {
 			good_bytes = blk_rq_bytes(req);
 			scsi_set_resid(SCpnt, 0);
@@ -1787,6 +1848,7 @@ static int sd_done(struct scsi_cmnd *SCpnt)
 			good_bytes = 0;
 			scsi_set_resid(SCpnt, blk_rq_bytes(req));
 		}
+		break;
 	}
 
 	if (result) {
@@ -1829,6 +1891,10 @@ static int sd_done(struct scsi_cmnd *SCpnt)
 			case UNMAP:
 				sd_config_discard(sdkp, SD_LBP_DISABLE);
 				break;
+			case ZBC_OUT:
+				if (sa == ZO_RESET_WRITE_POINTER)
+					sd_config_discard(sdkp, SD_LBP_DISABLE);
+				break;
 			case WRITE_SAME_16:
 			case WRITE_SAME:
 				if (unmap)
@@ -1847,7 +1913,11 @@ static int sd_done(struct scsi_cmnd *SCpnt)
 	default:
 		break;
 	}
+
  out:
+	if (sdkp->zoned == 1 || sdkp->device->type == TYPE_ZBC)
+		sd_zbc_done(SCpnt, &sshdr);
+
 	SCSI_LOG_HLCOMPLETE(1, scmd_printk(KERN_INFO, SCpnt,
 					   "sd_done: completed %d of %d bytes\n",
 					   good_bytes, scsi_bufflen(SCpnt)));
@@ -1982,7 +2052,6 @@ sd_spinup_disk(struct scsi_disk *sdkp)
 	}
 }
 
-
 /*
  * Determine whether disk supports Data Integrity Field.
  */
@@ -2132,6 +2201,9 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
 	/* Logical blocks per physical block exponent */
 	sdkp->physical_block_size = (1 << (buffer[13] & 0xf)) * sector_size;
 
+	/* RC basis */
+	sdkp->rc_basis = (buffer[12] >> 4) & 0x3;
+
 	/* Lowest aligned logical block */
 	alignment = ((buffer[14] & 0x3f) << 8 | buffer[15]) * sector_size;
 	blk_queue_alignment_offset(sdp->request_queue, alignment);
@@ -2322,6 +2394,11 @@ got_data:
 		sector_size = 512;
 	}
 	blk_queue_logical_block_size(sdp->request_queue, sector_size);
+	blk_queue_physical_block_size(sdp->request_queue,
+				      sdkp->physical_block_size);
+	sdkp->device->sector_size = sector_size;
+
+	sd_zbc_read_zones(sdkp, buffer);
 
 	{
 		char cap_str_2[10], cap_str_10[10];
@@ -2348,9 +2425,6 @@ got_data:
 	if (sdkp->capacity > 0xffffffff)
 		sdp->use_16_for_rw = 1;
 
-	blk_queue_physical_block_size(sdp->request_queue,
-				      sdkp->physical_block_size);
-	sdkp->device->sector_size = sector_size;
 }
 
 /* called with buffer of length 512 */
@@ -2612,7 +2686,7 @@ static void sd_read_app_tag_own(struct scsi_disk *sdkp, unsigned char *buffer)
 	struct scsi_mode_data data;
 	struct scsi_sense_hdr sshdr;
 
-	if (sdp->type != TYPE_DISK)
+	if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
 		return;
 
 	if (sdkp->protection_type == 0)
@@ -2719,6 +2793,7 @@ static void sd_read_block_limits(struct scsi_disk *sdkp)
  */
 static void sd_read_block_characteristics(struct scsi_disk *sdkp)
 {
+	struct request_queue *q = sdkp->disk->queue;
 	unsigned char *buffer;
 	u16 rot;
 	const int vpd_len = 64;
@@ -2733,10 +2808,21 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
 	rot = get_unaligned_be16(&buffer[4]);
 
 	if (rot == 1) {
-		queue_flag_set_unlocked(QUEUE_FLAG_NONROT, sdkp->disk->queue);
-		queue_flag_clear_unlocked(QUEUE_FLAG_ADD_RANDOM, sdkp->disk->queue);
+		queue_flag_set_unlocked(QUEUE_FLAG_NONROT, q);
+		queue_flag_clear_unlocked(QUEUE_FLAG_ADD_RANDOM, q);
 	}
 
+	sdkp->zoned = (buffer[8] >> 4) & 3;
+	if (sdkp->zoned == 1)
+		q->limits.zoned = BLK_ZONED_HA;
+	else if (sdkp->device->type == TYPE_ZBC)
+		q->limits.zoned = BLK_ZONED_HM;
+	else
+		q->limits.zoned = BLK_ZONED_NONE;
+	if (blk_queue_zoned(q) && sdkp->first_scan)
+		sd_printk(KERN_NOTICE, sdkp, "Host-%s zoned block device\n",
+			  q->limits.zoned == BLK_ZONED_HM ? "managed" : "aware");
+
  out:
 	kfree(buffer);
 }
@@ -2835,14 +2921,14 @@ static int sd_revalidate_disk(struct gendisk *disk)
 	 * react badly if we do.
 	 */
 	if (sdkp->media_present) {
-		sd_read_capacity(sdkp, buffer);
-
 		if (scsi_device_supports_vpd(sdp)) {
 			sd_read_block_provisioning(sdkp);
 			sd_read_block_limits(sdkp);
 			sd_read_block_characteristics(sdkp);
 		}
 
+		sd_read_capacity(sdkp, buffer);
+
 		sd_read_write_protect_flag(sdkp, buffer);
 		sd_read_cache_type(sdkp, buffer);
 		sd_read_app_tag_own(sdkp, buffer);
@@ -3040,9 +3126,16 @@ static int sd_probe(struct device *dev)
 
 	scsi_autopm_get_device(sdp);
 	error = -ENODEV;
-	if (sdp->type != TYPE_DISK && sdp->type != TYPE_MOD && sdp->type != TYPE_RBC)
+	if (sdp->type != TYPE_DISK &&
+	    sdp->type != TYPE_ZBC &&
+	    sdp->type != TYPE_MOD &&
+	    sdp->type != TYPE_RBC)
 		goto out;
 
+#ifndef CONFIG_BLK_DEV_ZONED
+	if (sdp->type == TYPE_ZBC)
+		goto out;
+#endif
 	SCSI_LOG_HLQUEUE(3, sdev_printk(KERN_INFO, sdp,
 					"sd_probe\n"));
 
@@ -3146,6 +3239,8 @@ static int sd_remove(struct device *dev)
 	del_gendisk(sdkp->disk);
 	sd_shutdown(dev);
 
+	sd_zbc_remove(sdkp);
+
 	blk_register_region(devt, SD_MINORS, NULL,
 			    sd_default_probe, NULL, NULL);
 
diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
index 765a6f1..3452871 100644
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -56,6 +56,7 @@ enum {
 	SD_LBP_WS16,		/* Use WRITE SAME(16) with UNMAP bit */
 	SD_LBP_WS10,		/* Use WRITE SAME(10) with UNMAP bit */
 	SD_LBP_ZERO,		/* Use WRITE SAME(10) with zero payload */
+	SD_ZBC_RESET_WP,	/* Use RESET WRITE POINTER */
 	SD_LBP_DISABLE,		/* Discard disabled due to failed cmd */
 };
 
@@ -64,6 +65,11 @@ struct scsi_disk {
 	struct scsi_device *device;
 	struct device	dev;
 	struct gendisk	*disk;
+#ifdef CONFIG_BLK_DEV_ZONED
+	struct workqueue_struct *zone_work_q;
+	sector_t zone_sectors;
+	unsigned int nr_zones;
+#endif
 	atomic_t	openers;
 	sector_t	capacity;	/* size in logical blocks */
 	u32		max_xfer_blocks;
@@ -94,6 +100,8 @@ struct scsi_disk {
 	unsigned	lbpvpd : 1;
 	unsigned	ws10 : 1;
 	unsigned	ws16 : 1;
+	unsigned	rc_basis: 2;
+	unsigned	zoned: 2;
 };
 #define to_scsi_disk(obj) container_of(obj,struct scsi_disk,dev)
 
@@ -156,6 +164,13 @@ static inline unsigned int logical_to_bytes(struct scsi_device *sdev, sector_t b
 	return blocks * sdev->sector_size;
 }
 
+static inline sector_t sectors_to_logical(struct scsi_device *sdev, sector_t sector)
+{
+	return sector >> (ilog2(sdev->sector_size) - 9);
+}
+
+extern void sd_config_discard(struct scsi_disk *, unsigned int);
+
 /*
  * A DIF-capable target device can be formatted with different
  * protection schemes.  Currently 0 through 3 are defined:
@@ -269,4 +284,57 @@ static inline void sd_dif_complete(struct scsi_cmnd *cmd, unsigned int a)
 
 #endif /* CONFIG_BLK_DEV_INTEGRITY */
 
+#ifdef CONFIG_BLK_DEV_ZONED
+
+extern void sd_zbc_read_zones(struct scsi_disk *, char *);
+extern void sd_zbc_remove(struct scsi_disk *);
+extern int sd_zbc_setup_read_write(struct scsi_disk *, struct request *,
+				   sector_t, unsigned int *);
+extern int sd_zbc_setup_report_cmnd(struct scsi_cmnd *);
+extern int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *);
+extern int sd_zbc_setup_open_cmnd(struct scsi_cmnd *);
+extern int sd_zbc_setup_close_cmnd(struct scsi_cmnd *);
+extern int sd_zbc_setup_finish_cmnd(struct scsi_cmnd *);
+extern void sd_zbc_done(struct scsi_cmnd *, struct scsi_sense_hdr *);
+
+#else /* CONFIG_BLK_DEV_ZONED */
+
+static inline void sd_zbc_read_zones(struct scsi_disk *sdkp,
+				     char *buf) {}
+static inline void sd_zbc_remove(struct scsi_disk *sdkp) {}
+
+static inline int sd_zbc_setup_read_write(struct scsi_disk *sdkp,
+					  struct request *rq, sector_t sector,
+					  unsigned int *num_sectors)
+{
+	/* Let the drive fail requests */
+	return BLKPREP_OK;
+}
+
+static inline int sd_zbc_setup_report_cmnd(struct scsi_cmnd *cmd)
+{
+	return BLKPREP_KILL;
+}
+static inline int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *cmd)
+{
+	return BLKPREP_KILL;
+}
+static inline int sd_zbc_setup_open_cmnd(struct scsi_cmnd *cmd)
+{
+	return BLKPREP_KILL;
+}
+static inline int sd_zbc_setup_close_cmnd(struct scsi_cmnd *cmd)
+{
+	return BLKPREP_KILL;
+}
+static inline int sd_zbc_setup_finish_cmnd(struct scsi_cmnd *cmd)
+{
+	return BLKPREP_KILL;
+}
+
+static inline void sd_zbc_done(struct scsi_cmnd *cmd,
+			       struct scsi_sense_hdr *sshdr) {}
+
+#endif /* CONFIG_BLK_DEV_ZONED */
+
 #endif /* _SCSI_DISK_H */
diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
new file mode 100644
index 0000000..ec9c3fc
--- /dev/null
+++ b/drivers/scsi/sd_zbc.c
@@ -0,0 +1,1097 @@
+/*
+ * SCSI Zoned Block commands
+ *
+ * Copyright (C) 2014-2015 SUSE Linux GmbH
+ * Written by: Hannes Reinecke <hare@suse.de>
+ * Modified by: Damien Le Moal <damien.lemoal@hgst.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License version
+ * 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; see the file COPYING.  If not, write to
+ * the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139,
+ * USA.
+ *
+ */
+
+#include <linux/blkdev.h>
+#include <linux/rbtree.h>
+
+#include <asm/unaligned.h>
+
+#include <scsi/scsi.h>
+#include <scsi/scsi_cmnd.h>
+#include <scsi/scsi_dbg.h>
+#include <scsi/scsi_device.h>
+#include <scsi/scsi_driver.h>
+#include <scsi/scsi_host.h>
+#include <scsi/scsi_eh.h>
+
+#include "sd.h"
+#include "scsi_priv.h"
+
+enum zbc_zone_type {
+	ZBC_ZONE_TYPE_CONV = 0x1,
+	ZBC_ZONE_TYPE_SEQWRITE_REQ,
+	ZBC_ZONE_TYPE_SEQWRITE_PREF,
+	ZBC_ZONE_TYPE_RESERVED,
+};
+
+enum zbc_zone_cond {
+	ZBC_ZONE_COND_NO_WP,
+	ZBC_ZONE_COND_EMPTY,
+	ZBC_ZONE_COND_IMP_OPEN,
+	ZBC_ZONE_COND_EXP_OPEN,
+	ZBC_ZONE_COND_CLOSED,
+	ZBC_ZONE_COND_READONLY = 0xd,
+	ZBC_ZONE_COND_FULL,
+	ZBC_ZONE_COND_OFFLINE,
+};
+
+#define SD_ZBC_BUF_SIZE 131072
+
+#define sd_zbc_debug(sdkp, fmt, args...)			\
+	pr_debug("%s %s [%s]: " fmt,				\
+		 dev_driver_string(&(sdkp)->device->sdev_gendev), \
+		 dev_name(&(sdkp)->device->sdev_gendev),	 \
+		 (sdkp)->disk->disk_name, ## args)
+
+#define sd_zbc_debug_ratelimit(sdkp, fmt, args...)		\
+	do {							\
+		if (printk_ratelimit())				\
+			sd_zbc_debug(sdkp, fmt, ## args);	\
+	} while (0)
+
+#define sd_zbc_err(sdkp, fmt, args...)				\
+	pr_err("%s %s [%s]: " fmt,				\
+	       dev_driver_string(&(sdkp)->device->sdev_gendev),	\
+	       dev_name(&(sdkp)->device->sdev_gendev),		\
+	       (sdkp)->disk->disk_name, ## args)
+
+struct zbc_zone_work {
+	struct work_struct 	zone_work;
+	struct scsi_disk 	*sdkp;
+	sector_t		sector;
+	sector_t		nr_sects;
+	bool 			init;
+	unsigned int		nr_zones;
+};
+
+static struct blk_zone *zbc_desc_to_zone(struct scsi_disk *sdkp, unsigned char *rec)
+{
+	struct blk_zone *zone;
+
+	zone = kzalloc(sizeof(struct blk_zone), GFP_KERNEL);
+	if (!zone)
+		return NULL;
+
+	/* Zone type */
+	switch (rec[0] & 0x0f) {
+	case ZBC_ZONE_TYPE_CONV:
+	case ZBC_ZONE_TYPE_SEQWRITE_REQ:
+	case ZBC_ZONE_TYPE_SEQWRITE_PREF:
+		zone->type = rec[0] & 0x0f;
+		break;
+	default:
+		zone->type = BLK_ZONE_TYPE_UNKNOWN;
+		break;
+	}
+
+	/* Zone condition */
+	zone->cond = (rec[1] >> 4) & 0xf;
+	if (rec[1] & 0x01)
+		zone->reset = 1;
+	if (rec[1] & 0x02)
+		zone->non_seq = 1;
+
+	/* Zone start sector and length */
+	zone->len = logical_to_sectors(sdkp->device,
+				       get_unaligned_be64(&rec[8]));
+	zone->start = logical_to_sectors(sdkp->device,
+					 get_unaligned_be64(&rec[16]));
+
+	/* Zone write pointer */
+	if (blk_zone_is_empty(zone) &&
+	    zone->wp != zone->start)
+		zone->wp = zone->start;
+	else if (blk_zone_is_full(zone))
+		zone->wp = zone->start + zone->len;
+	else if (blk_zone_is_seq(zone))
+		zone->wp = logical_to_sectors(sdkp->device,
+					      get_unaligned_be64(&rec[24]));
+	else
+		zone->wp = (sector_t)-1;
+
+	return zone;
+}
+
+static int zbc_parse_zones(struct scsi_disk *sdkp, unsigned char *buf,
+			   unsigned int buf_len, sector_t *next_sector)
+{
+	struct request_queue *q = sdkp->disk->queue;
+	sector_t capacity = logical_to_sectors(sdkp->device, sdkp->capacity);
+	unsigned char *rec = buf;
+	unsigned int zone_len, list_length;
+
+	/* Parse REPORT ZONES header */
+	list_length = get_unaligned_be32(&buf[0]);
+	rec = buf + 64;
+	list_length += 64;
+
+	if (list_length < buf_len)
+		buf_len = list_length;
+
+	/* Parse REPORT ZONES zone descriptors */
+	*next_sector = capacity;
+	while (rec < buf + buf_len) {
+
+		struct blk_zone *new, *old;
+
+		new = zbc_desc_to_zone(sdkp, rec);
+		if (!new)
+			return -ENOMEM;
+
+		zone_len = new->len;
+		*next_sector = new->start + zone_len;
+
+		old = blk_insert_zone(q, new);
+		if (old) {
+			blk_lock_zone(old);
+
+			/*
+			 * Always update the zone state flags and the zone
+			 * offline and read-only condition as the drive may
+			 * change those independently of the commands being
+			 * executed
+			 */
+			old->reset = new->reset;
+			old->non_seq = new->non_seq;
+			if (blk_zone_is_offline(new) ||
+			    blk_zone_is_readonly(new))
+				old->cond = new->cond;
+
+			if (blk_zone_in_update(old)) {
+				old->cond = new->cond;
+				old->wp = new->wp;
+				blk_clear_zone_update(old);
+			}
+
+			blk_unlock_zone(old);
+
+			kfree(new);
+		}
+
+		rec += 64;
+
+	}
+
+	return 0;
+}
+
+/**
+ * sd_zbc_report_zones - Issue a REPORT ZONES scsi command
+ * @sdkp: SCSI disk to which the command should be sent
+ * @buffer: response buffer
+ * @bufflen: length of @buffer
+ * @start_sector: sector from which zone information should be reported
+ * @option: reporting option to be used
+ * @partial: flag to set the 'partial' bit for report zones command
+ */
+int sd_zbc_report_zones(struct scsi_disk *sdkp, unsigned char *buffer,
+			int bufflen, sector_t start_sector,
+			enum zbc_zone_reporting_options option, bool partial)
+{
+	struct scsi_device *sdp = sdkp->device;
+	const int timeout = sdp->request_queue->rq_timeout;
+	struct scsi_sense_hdr sshdr;
+	sector_t start_lba = sectors_to_logical(sdkp->device, start_sector);
+	unsigned char cmd[16];
+	int result;
+
+	if (!scsi_device_online(sdp))
+		return -ENODEV;
+
+	sd_zbc_debug(sdkp, "REPORT ZONES lba %llu len %d\n",
+		     (unsigned long long)start_lba, bufflen);
+
+	memset(cmd, 0, 16);
+	cmd[0] = ZBC_IN;
+	cmd[1] = ZI_REPORT_ZONES;
+	put_unaligned_be64(start_lba, &cmd[2]);
+	put_unaligned_be32(bufflen, &cmd[10]);
+	cmd[14] = (partial ? ZBC_REPORT_ZONE_PARTIAL : 0) | option;
+	memset(buffer, 0, bufflen);
+
+	result = scsi_execute_req(sdp, cmd, DMA_FROM_DEVICE,
+				buffer, bufflen, &sshdr,
+				timeout, SD_MAX_RETRIES, NULL);
+
+	if (result) {
+		sd_zbc_err(sdkp, "REPORT ZONES lba %llu failed with %d/%d\n",
+			   (unsigned long long)start_lba,
+			   host_byte(result), driver_byte(result));
+		return -EIO;
+	}
+
+	return 0;
+}
+
+/*
+ * Set or clear the update flag of all zones contained
+ * in the range sector..sector+nr_sects.
+ * Return the number of zones marked/cleared.
+ */
+static int __sd_zbc_zones_updating(struct scsi_disk *sdkp,
+				   sector_t sector, sector_t nr_sects,
+				   bool set)
+{
+	struct request_queue *q = sdkp->disk->queue;
+	struct blk_zone *zone;
+	struct rb_node *node;
+	unsigned long flags;
+	int nr_zones = 0;
+
+	if (!nr_sects) {
+		/* All zones */
+		sector = 0;
+		nr_sects = logical_to_sectors(sdkp->device, sdkp->capacity);
+	}
+
+	spin_lock_irqsave(&q->zones_lock, flags);
+	for (node = rb_first(&q->zones); node && nr_sects; node = rb_next(node)) {
+		zone = rb_entry(node, struct blk_zone, node);
+		if (sector < zone->start || sector >= (zone->start + zone->len))
+			continue;
+		if (set) {
+			if (!test_and_set_bit_lock(BLK_ZONE_IN_UPDATE, &zone->flags))
+				nr_zones++;
+		} else if (test_and_clear_bit(BLK_ZONE_IN_UPDATE, &zone->flags)) {
+			wake_up_bit(&zone->flags, BLK_ZONE_IN_UPDATE);
+			nr_zones++;
+		}
+		sector = zone->start + zone->len;
+		if (nr_sects <= zone->len)
+			nr_sects = 0;
+		else
+			nr_sects -= zone->len;
+	}
+	spin_unlock_irqrestore(&q->zones_lock, flags);
+
+	return nr_zones;
+}
+
+static inline int sd_zbc_set_zones_updating(struct scsi_disk *sdkp,
+					    sector_t sector, sector_t nr_sects)
+{
+	return __sd_zbc_zones_updating(sdkp, sector, nr_sects, true);
+}
+
+static inline int sd_zbc_clear_zones_updating(struct scsi_disk *sdkp,
+					      sector_t sector, sector_t nr_sects)
+{
+	return __sd_zbc_zones_updating(sdkp, sector, nr_sects, false);
+}
+
+static void sd_zbc_start_queue(struct request_queue *q)
+{
+	unsigned long flags;
+
+	if (q->mq_ops) {
+		blk_mq_start_hw_queues(q);
+	} else {
+		spin_lock_irqsave(q->queue_lock, flags);
+		blk_start_queue(q);
+		spin_unlock_irqrestore(q->queue_lock, flags);
+	}
+}
+
+static void sd_zbc_update_zone_work(struct work_struct *work)
+{
+	struct zbc_zone_work *zwork =
+		container_of(work, struct zbc_zone_work, zone_work);
+	struct scsi_disk *sdkp = zwork->sdkp;
+	sector_t capacity = logical_to_sectors(sdkp->device, sdkp->capacity);
+	struct request_queue *q = sdkp->disk->queue;
+	sector_t end_sector, sector = zwork->sector;
+	unsigned int bufsize;
+	unsigned char *buf;
+	int ret = -ENOMEM;
+
+	/* Get a buffer */
+	if (!zwork->nr_zones) {
+		bufsize = SD_ZBC_BUF_SIZE;
+	} else {
+		bufsize = (zwork->nr_zones + 1) * 64;
+		if (bufsize < 512)
+			bufsize = 512;
+		else if (bufsize > SD_ZBC_BUF_SIZE)
+			bufsize = SD_ZBC_BUF_SIZE;
+		else
+			bufsize = (bufsize + 511) & ~511;
+	}
+	buf = kmalloc(bufsize, GFP_KERNEL | GFP_DMA);
+	if (!buf) {
+		sd_zbc_err(sdkp, "Failed to allocate zone report buffer\n");
+		goto done_free;
+	}
+
+	/* Process sector range */
+	end_sector = zwork->sector + zwork->nr_sects;
+	while (sector < min(end_sector, capacity)) {
+
+		/* Get zone report */
+		ret = sd_zbc_report_zones(sdkp, buf, bufsize, sector,
+					  ZBC_ZONE_REPORTING_OPTION_ALL, true);
+		if (ret)
+			break;
+
+		ret = zbc_parse_zones(sdkp, buf, bufsize, &sector);
+		if (ret)
+			break;
+
+		/* Kick start the queue to allow requests waiting */
+		/* for the zones just updated to run              */
+		sd_zbc_start_queue(q);
+
+	}
+
+done_free:
+	if (ret)
+		sd_zbc_clear_zones_updating(sdkp, zwork->sector, zwork->nr_sects);
+	if (buf)
+		kfree(buf);
+	kfree(zwork);
+}
+
+/**
+ * sd_zbc_update_zones - Update zone information for the zones in a sector
+ * range. If not in init mode, the update is done only for zones marked
+ * with the update flag. @gfpflags is used to allocate the work item.
+ * @sdkp: SCSI disk for which the zone information needs to be updated
+ * @sector: First sector of the first zone to be updated
+ * @nr_sects: Number of sectors to cover, 0 meaning all zones
+ */
+static int sd_zbc_update_zones(struct scsi_disk *sdkp,
+			       sector_t sector, sector_t nr_sects,
+			       gfp_t gfpflags, bool init)
+{
+	struct zbc_zone_work *zwork;
+
+	zwork = kzalloc(sizeof(struct zbc_zone_work), gfpflags);
+	if (!zwork) {
+		sd_zbc_err(sdkp, "Failed to allocate zone work\n");
+		return -ENOMEM;
+	}
+
+	if (!nr_sects) {
+		/* All zones */
+		sector = 0;
+		nr_sects = logical_to_sectors(sdkp->device, sdkp->capacity);
+	}
+
+	INIT_WORK(&zwork->zone_work, sd_zbc_update_zone_work);
+	zwork->sdkp = sdkp;
+	zwork->sector = sector;
+	zwork->nr_sects = nr_sects;
+	zwork->init = init;
+
+	if (!init)
+		/* Mark the zones falling in the report as updating */
+		zwork->nr_zones = sd_zbc_set_zones_updating(sdkp, sector, nr_sects);
+
+	if (init || zwork->nr_zones)
+		queue_work(sdkp->zone_work_q, &zwork->zone_work);
+	else
+		kfree(zwork);
+
+	return 0;
+}
+
+int sd_zbc_setup_report_cmnd(struct scsi_cmnd *cmd)
+{
+	struct request *rq = cmd->request;
+	struct gendisk *disk = rq->rq_disk;
+	struct scsi_disk *sdkp = scsi_disk(disk);
+	int ret;
+
+	if (!sdkp->zone_work_q)
+		return BLKPREP_KILL;
+
+	ret = sd_zbc_update_zones(sdkp, blk_rq_pos(rq), blk_rq_sectors(rq),
+				  GFP_ATOMIC, false);
+	if (unlikely(ret))
+		return BLKPREP_DEFER;
+
+	return BLKPREP_DONE;
+}
+
+static void sd_zbc_setup_action_cmnd(struct scsi_cmnd *cmd,
+				     u8 action,
+				     bool all)
+{
+	struct request *rq = cmd->request;
+	struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
+	sector_t lba;
+
+	cmd->cmd_len = 16;
+	cmd->cmnd[0] = ZBC_OUT;
+	cmd->cmnd[1] = action;
+	if (all) {
+		cmd->cmnd[14] |= 0x01;
+	} else {
+		lba = sectors_to_logical(sdkp->device, blk_rq_pos(rq));
+		put_unaligned_be64(lba, &cmd->cmnd[2]);
+	}
+
+	rq->completion_data = NULL;
+	rq->timeout = SD_TIMEOUT;
+	rq->__data_len = blk_rq_bytes(rq);
+
+	/* Don't retry */
+	cmd->allowed = 0;
+	cmd->transfersize = 0;
+	cmd->sc_data_direction = DMA_NONE;
+}
+
+int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *cmd)
+{
+	struct request *rq = cmd->request;
+	struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
+	sector_t sector = blk_rq_pos(rq);
+	sector_t nr_sects = blk_rq_sectors(rq);
+	struct blk_zone *zone = NULL;
+	int ret = BLKPREP_OK;
+
+	if (nr_sects) {
+		zone = blk_lookup_zone(rq->q, sector);
+		if (!zone)
+			return BLKPREP_KILL;
+	}
+
+	if (zone) {
+
+		blk_lock_zone(zone);
+
+		/* If the zone is being updated, wait */
+		if (blk_zone_in_update(zone)) {
+			ret = BLKPREP_DEFER;
+			goto out;
+		}
+
+		if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
+			sd_zbc_debug(sdkp,
+				     "Discarding unknown zone %zu\n",
+				     zone->start);
+			ret = BLKPREP_KILL;
+			goto out;
+		}
+
+		/* Nothing to do for conventional zones */
+		if (blk_zone_is_conv(zone)) {
+			ret = BLKPREP_DONE;
+			goto out;
+		}
+
+		if (!blk_try_write_lock_zone(zone)) {
+			ret = BLKPREP_DEFER;
+			goto out;
+		}
+
+		/* Nothing to do if the zone is already empty */
+		if (blk_zone_is_empty(zone)) {
+			blk_write_unlock_zone(zone);
+			ret = BLKPREP_DONE;
+			goto out;
+		}
+
+		if (sector != zone->start ||
+		    (nr_sects != zone->len)) {
+			sd_printk(KERN_ERR, sdkp,
+				  "Unaligned reset wp request, start %zu/%zu"
+				  " len %zu/%zu\n",
+				  zone->start, sector, zone->len, nr_sects);
+			blk_write_unlock_zone(zone);
+			ret = BLKPREP_KILL;
+			goto out;
+		}
+
+	}
+
+	sd_zbc_setup_action_cmnd(cmd, ZO_RESET_WRITE_POINTER, !zone);
+
+out:
+	if (zone) {
+		if (ret == BLKPREP_OK) {
+			/*
+			 * Opportunistic update. Will be fixed up
+			 * with zone update if the command fails.
+			 */
+			zone->wp = zone->start;
+			zone->cond = BLK_ZONE_COND_EMPTY;
+			zone->reset = 0;
+			zone->non_seq = 0;
+		}
+		blk_unlock_zone(zone);
+	}
+
+	return ret;
+}
+
+int sd_zbc_setup_open_cmnd(struct scsi_cmnd *cmd)
+{
+	struct request *rq = cmd->request;
+	struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
+	sector_t sector = blk_rq_pos(rq);
+	sector_t nr_sects = blk_rq_sectors(rq);
+	struct blk_zone *zone = NULL;
+	int ret = BLKPREP_OK;
+
+	if (nr_sects) {
+		zone = blk_lookup_zone(rq->q, sector);
+		if (!zone)
+			return BLKPREP_KILL;
+	}
+
+	if (zone) {
+
+		blk_lock_zone(zone);
+
+		/* If the zone is being updated, wait */
+		if (blk_zone_in_update(zone)) {
+			ret = BLKPREP_DEFER;
+			goto out;
+		}
+
+		if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
+			sd_zbc_debug(sdkp,
+				     "Opening unknown zone %zu\n",
+				     zone->start);
+			ret = BLKPREP_KILL;
+			goto out;
+		}
+
+		/*
+		 * Nothing to do for conventional zones,
+		 * zones already open or full zones.
+		 */
+		if (blk_zone_is_conv(zone) ||
+		    blk_zone_is_open(zone) ||
+		    blk_zone_is_full(zone)) {
+			ret = BLKPREP_DONE;
+			goto out;
+		}
+
+		if (sector != zone->start ||
+		    (nr_sects != zone->len)) {
+			sd_printk(KERN_ERR, sdkp,
+				  "Unaligned open zone request, start %zu/%zu"
+				  " len %zu/%zu\n",
+				  zone->start, sector, zone->len, nr_sects);
+			ret = BLKPREP_KILL;
+			goto out;
+		}
+
+	}
+
+	sd_zbc_setup_action_cmnd(cmd, ZO_OPEN_ZONE, !zone);
+
+out:
+	if (zone) {
+		if (ret == BLKPREP_OK)
+			/*
+			 * Opportunistic update. Will be fixed up
+			 * with zone update if the command fails.
+			 */
+			zone->cond = BLK_ZONE_COND_EXP_OPEN;
+		blk_unlock_zone(zone);
+	}
+
+	return ret;
+}
+
+int sd_zbc_setup_close_cmnd(struct scsi_cmnd *cmd)
+{
+	struct request *rq = cmd->request;
+	struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
+	sector_t sector = blk_rq_pos(rq);
+	sector_t nr_sects = blk_rq_sectors(rq);
+	struct blk_zone *zone = NULL;
+	int ret = BLKPREP_OK;
+
+	if (nr_sects) {
+		zone = blk_lookup_zone(rq->q, sector);
+		if (!zone)
+			return BLKPREP_KILL;
+	}
+
+	if (zone) {
+
+		blk_lock_zone(zone);
+
+		/* If the zone is being updated, wait */
+		if (blk_zone_in_update(zone)) {
+			ret = BLKPREP_DEFER;
+			goto out;
+		}
+
+		if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
+			sd_zbc_debug(sdkp,
+				     "Closing unknown zone %zu\n",
+				     zone->start);
+			ret = BLKPREP_KILL;
+			goto out;
+		}
+
+		/*
+		 * Nothing to do for conventional zones,
+		 * full zones or empty zones.
+		 */
+		if (blk_zone_is_conv(zone) ||
+		    blk_zone_is_full(zone) ||
+		    blk_zone_is_empty(zone)) {
+			ret = BLKPREP_DONE;
+			goto out;
+		}
+
+		if (sector != zone->start ||
+		    (nr_sects != zone->len)) {
+			sd_printk(KERN_ERR, sdkp,
+				  "Unaligned close zone request, start %zu/%zu"
+				  " len %zu/%zu\n",
+				  zone->start, sector, zone->len, nr_sects);
+			ret = BLKPREP_KILL;
+			goto out;
+		}
+
+	}
+
+	sd_zbc_setup_action_cmnd(cmd, ZO_CLOSE_ZONE, !zone);
+
+out:
+	if (zone) {
+		if (ret == BLKPREP_OK)
+			/*
+			 * Opportunistic update. Will be fixed up
+			 * with zone update if the command fails.
+			 */
+			zone->cond = BLK_ZONE_COND_CLOSED;
+		blk_unlock_zone(zone);
+	}
+
+	return ret;
+}
+
+int sd_zbc_setup_finish_cmnd(struct scsi_cmnd *cmd)
+{
+	struct request *rq = cmd->request;
+	struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
+	sector_t sector = blk_rq_pos(rq);
+	sector_t nr_sects = blk_rq_sectors(rq);
+	struct blk_zone *zone = NULL;
+	int ret = BLKPREP_OK;
+
+	if (nr_sects) {
+		zone = blk_lookup_zone(rq->q, sector);
+		if (!zone)
+			return BLKPREP_KILL;
+	}
+
+	if (zone) {
+
+		blk_lock_zone(zone);
+
+		/* If the zone is being updated, wait */
+		if (blk_zone_in_update(zone)) {
+			ret = BLKPREP_DEFER;
+			goto out;
+		}
+
+		if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
+			sd_zbc_debug(sdkp,
+				     "Finishing unknown zone %zu\n",
+				     zone->start);
+			ret = BLKPREP_KILL;
+			goto out;
+		}
+
+		/* Nothing to do for conventional zones and full zones */
+		if (blk_zone_is_conv(zone) ||
+		    blk_zone_is_full(zone)) {
+			ret = BLKPREP_DONE;
+			goto out;
+		}
+
+		if (sector != zone->start ||
+		    (nr_sects != zone->len)) {
+			sd_printk(KERN_ERR, sdkp,
+				  "Unaligned finish zone request, start %zu/%zu"
+				  " len %zu/%zu\n",
+				  zone->start, sector, zone->len, nr_sects);
+			ret = BLKPREP_KILL;
+			goto out;
+		}
+
+	}
+
+	sd_zbc_setup_action_cmnd(cmd, ZO_FINISH_ZONE, !zone);
+
+out:
+	if (zone) {
+		if (ret == BLKPREP_OK) {
+			/*
+			 * Opportunistic update. Will be fixed up
+			 * with zone update if the command fails.
+			 */
+			zone->cond = BLK_ZONE_COND_FULL;
+			if (blk_zone_is_seq(zone))
+				zone->wp = zone->start + zone->len;
+		}
+		blk_unlock_zone(zone);
+	}
+
+	return ret;
+}
+
+int sd_zbc_setup_read_write(struct scsi_disk *sdkp, struct request *rq,
+			    sector_t sector, unsigned int *num_sectors)
+{
+	struct blk_zone *zone;
+	unsigned int sectors = *num_sectors;
+	int ret = BLKPREP_OK;
+
+	zone = blk_lookup_zone(rq->q, sector);
+	if (!zone)
+		/* Let the drive handle the request */
+		return BLKPREP_OK;
+
+	blk_lock_zone(zone);
+
+	/* If the zone is being updated, wait */
+	if (blk_zone_in_update(zone)) {
+		ret = BLKPREP_DEFER;
+		goto out;
+	}
+
+	if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
+		sd_zbc_debug(sdkp,
+			     "Unknown zone %zu\n",
+			     zone->start);
+		ret = BLKPREP_KILL;
+		goto out;
+	}
+
+	/* For offline and read-only zones, let the drive fail the command */
+	if (blk_zone_is_offline(zone) ||
+	    blk_zone_is_readonly(zone))
+		goto out;
+
+	/* Do not allow crossing zone boundaries */
+	if (sector + sectors > zone->start + zone->len) {
+		ret = BLKPREP_KILL;
+		goto out;
+	}
+
+	/* For conventional zones, no checks */
+	if (blk_zone_is_conv(zone))
+		goto out;
+
+	if (req_op(rq) == REQ_OP_WRITE ||
+	    req_op(rq) == REQ_OP_WRITE_SAME) {
+
+		/*
+		 * Write requests may change the write pointer and
+		 * transition the zone condition to full. Changes
+		 * are opportunistic here. If the request fails, a
+		 * zone update will fix the zone information.
+		 */
+		if (blk_zone_is_seq_req(zone)) {
+
+			/*
+			 * Do not issue more than one write at a time per
+			 * zone. This solves write ordering problems due to
+			 * the unlocking of the request queue in the dispatch
+			 * path in the non scsi-mq case. For scsi-mq, this
+			 * also avoids potential write reordering when multiple
+			 * threads running on different CPUs write to the same
+			 * zone (with a synchronized sequential pattern).
+			 */
+			if (!blk_try_write_lock_zone(zone)) {
+				ret = BLKPREP_DEFER;
+				goto out;
+			}
+
+			/* For host-managed drives, writes are allowed */
+			/* only at the write pointer position.         */
+			if (zone->wp != sector) {
+				blk_write_unlock_zone(zone);
+				ret = BLKPREP_KILL;
+				goto out;
+			}
+
+			zone->wp += sectors;
+			if (zone->wp >= zone->start + zone->len) {
+				zone->cond = BLK_ZONE_COND_FULL;
+				zone->wp = zone->start + zone->len;
+			}
+
+		} else {
+
+			/* For host-aware drives, writes are allowed */
+			/* anywhere in the zone, but wp can only go  */
+			/* forward.                                  */
+			sector_t end_sector = sector + sectors;
+			if (sector == zone->wp &&
+			    end_sector >= zone->start + zone->len) {
+				zone->cond = BLK_ZONE_COND_FULL;
+				zone->wp = zone->start + zone->len;
+			} else if (end_sector > zone->wp) {
+				zone->wp = end_sector;
+			}
+
+		}
+
+	} else {
+
+		/* Check for reads beyond the write pointer */
+		if (sector + sectors <= zone->wp)
+			goto out;
+
+		if (zone->wp <= sector) {
+			/* Read beyond WP: clear request buffer */
+			struct req_iterator iter;
+			struct bio_vec bvec;
+			unsigned long flags;
+			void *buf;
+			rq_for_each_segment(bvec, rq, iter) {
+				buf = bvec_kmap_irq(&bvec, &flags);
+				memset(buf, 0, bvec.bv_len);
+				flush_dcache_page(bvec.bv_page);
+				bvec_kunmap_irq(buf, &flags);
+			}
+			ret = BLKPREP_DONE;
+			goto out;
+		}
+
+		/* Read straddles the WP position: limit request size */
+		*num_sectors = zone->wp - sector;
+
+	}
+
+out:
+	blk_unlock_zone(zone);
+
+	return ret;
+}
+
+void sd_zbc_done(struct scsi_cmnd *cmd,
+		 struct scsi_sense_hdr *sshdr)
+{
+	int result = cmd->result;
+	struct request *rq = cmd->request;
+	struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
+	struct request_queue *q = sdkp->disk->queue;
+	sector_t pos = blk_rq_pos(rq);
+	struct blk_zone *zone = NULL;
+	bool write_unlock = false;
+
+	/*
+	 * Get the target zone of commands of interest. Some may
+	 * apply to all zones so check the request sectors first.
+	 */
+	switch (req_op(rq)) {
+	case REQ_OP_DISCARD:
+	case REQ_OP_WRITE:
+	case REQ_OP_WRITE_SAME:
+	case REQ_OP_ZONE_RESET:
+		write_unlock = true;
+		/* fallthru */
+	case REQ_OP_ZONE_OPEN:
+	case REQ_OP_ZONE_CLOSE:
+	case REQ_OP_ZONE_FINISH:
+		if (blk_rq_sectors(rq))
+			zone = blk_lookup_zone(q, pos);
+		break;
+	}
+
+	if (zone && write_unlock)
+		blk_write_unlock_zone(zone);
+
+	if (!result)
+		return;
+
+	if (sshdr->sense_key == ILLEGAL_REQUEST &&
+	    sshdr->asc == 0x21)
+		/*
+		 * It is unlikely that retrying requests that failed with any
+		 * kind of alignment error will result in success. So don't
+		 * try. Report the error back to the user quickly so that
+		 * corrective actions can be taken after obtaining updated
+		 * zone information.
+		 */
+		cmd->allowed = 0;
+
+	/* On error, force an update unless this is a failed report */
+	if (req_op(rq) == REQ_OP_ZONE_REPORT)
+		sd_zbc_clear_zones_updating(sdkp, pos, blk_rq_sectors(rq));
+	else if (zone)
+		sd_zbc_update_zones(sdkp, zone->start, zone->len,
+				    GFP_ATOMIC, false);
+}
+
+void sd_zbc_read_zones(struct scsi_disk *sdkp, char *buf)
+{
+	struct request_queue *q = sdkp->disk->queue;
+	struct blk_zone *zone;
+	sector_t capacity;
+	sector_t sector;
+	bool init = false;
+	u32 rep_len;
+	int ret = 0;
+
+	if (sdkp->zoned != 1 && sdkp->device->type != TYPE_ZBC)
+		/*
+		 * Device managed or normal SCSI disk,
+		 * no special handling required
+		 */
+		return;
+
+	/* Do a report zone to get the maximum LBA to check capacity */
+	ret = sd_zbc_report_zones(sdkp, buf, SD_BUF_SIZE,
+				  0, ZBC_ZONE_REPORTING_OPTION_ALL, false);
+	if (ret < 0)
+		return;
+
+	rep_len = get_unaligned_be32(&buf[0]);
+	if (rep_len < 64) {
+		sd_printk(KERN_WARNING, sdkp,
+			  "REPORT ZONES report invalid length %u\n",
+			  rep_len);
+		return;
+	}
+
+	if (sdkp->rc_basis == 0) {
+		/* The max_lba field is the capacity of this device */
+		sector_t lba = get_unaligned_be64(&buf[8]);
+		if (lba + 1 > sdkp->capacity) {
+			if (sdkp->first_scan)
+				sd_printk(KERN_WARNING, sdkp,
+					  "Changing capacity from %zu "
+					  "to max LBA+1 %zu\n",
+					  sdkp->capacity,
+					  (sector_t) lba + 1);
+			sdkp->capacity = lba + 1;
+		}
+	}
+
+	/* Setup the zone work queue */
+	if (!sdkp->zone_work_q) {
+		sdkp->zone_work_q =
+			alloc_ordered_workqueue("zbc_wq_%s", WQ_MEM_RECLAIM,
+						sdkp->disk->disk_name);
+		if (!sdkp->zone_work_q) {
+			sdev_printk(KERN_WARNING, sdkp->device,
+				    "Create zoned disk workqueue failed\n");
+			return;
+		}
+		init = true;
+	}
+
+	/*
+	 * Parse what we already got. If not all zones have been parsed yet,
+	 * kick start an update to get the remaining ones.
+	 */
+	capacity = logical_to_sectors(sdkp->device, sdkp->capacity);
+	ret = zbc_parse_zones(sdkp, buf, SD_BUF_SIZE, &sector);
+	if (ret == 0 && sector < capacity) {
+		sd_zbc_update_zones(sdkp, sector, capacity - sector,
+				    GFP_KERNEL, init);
+		drain_workqueue(sdkp->zone_work_q);
+	}
+	if (ret)
+		return;
+
+	/*
+	 * Analyze the zones layout: if all zones are the same size and
+	 * the size is a power of 2, chunk the device and map discard to
+	 * reset write pointer command. Otherwise, disable discard.
+	 */
+	sdkp->zone_sectors = 0;
+	sdkp->nr_zones = 0;
+	sector = 0;
+	while (sector < capacity) {
+
+		zone = blk_lookup_zone(q, sector);
+		if (!zone) {
+			sdkp->zone_sectors = 0;
+			sdkp->nr_zones = 0;
+			break;
+		}
+
+		sector += zone->len;
+
+		if (sdkp->zone_sectors == 0) {
+			sdkp->zone_sectors = zone->len;
+		} else if (sector != capacity &&
+			 zone->len != sdkp->zone_sectors) {
+			sdkp->zone_sectors = 0;
+			sdkp->nr_zones = 0;
+			break;
+		}
+
+		sdkp->nr_zones++;
+
+	}
+
+	if (!sdkp->zone_sectors ||
+	    !is_power_of_2(sdkp->zone_sectors)) {
+		sd_config_discard(sdkp, SD_LBP_DISABLE);
+		if (sdkp->first_scan)
+			sd_printk(KERN_NOTICE, sdkp,
+				  "%u zones (non constant zone size)\n",
+				  sdkp->nr_zones);
+		return;
+	}
+
+	/* Setup discard granularity to the zone size */
+	blk_queue_chunk_sectors(sdkp->disk->queue, sdkp->zone_sectors);
+	sdkp->max_unmap_blocks = sdkp->zone_sectors;
+	sdkp->unmap_alignment = sectors_to_logical(sdkp->device,
+						   sdkp->zone_sectors);
+	sdkp->unmap_granularity = sdkp->unmap_alignment;
+	sd_config_discard(sdkp, SD_ZBC_RESET_WP);
+
+	if (sdkp->first_scan) {
+		if (sdkp->nr_zones * sdkp->zone_sectors == capacity)
+			sd_printk(KERN_NOTICE, sdkp,
+				  "%u zones of %zu sectors\n",
+				  sdkp->nr_zones,
+				  sdkp->zone_sectors);
+		else
+			sd_printk(KERN_NOTICE, sdkp,
+				  "%u zones of %zu sectors "
+				  "+ 1 runt zone\n",
+				  sdkp->nr_zones - 1,
+				  sdkp->zone_sectors);
+	}
+}
+
+void sd_zbc_remove(struct scsi_disk *sdkp)
+{
+
+	sd_config_discard(sdkp, SD_LBP_DISABLE);
+
+	if (sdkp->zone_work_q) {
+		drain_workqueue(sdkp->zone_work_q);
+		destroy_workqueue(sdkp->zone_work_q);
+		sdkp->zone_work_q = NULL;
+		blk_drop_zones(sdkp->disk->queue);
+	}
+}
+
diff --git a/include/scsi/scsi_proto.h b/include/scsi/scsi_proto.h
index d1defd1..6ba66e0 100644
--- a/include/scsi/scsi_proto.h
+++ b/include/scsi/scsi_proto.h
@@ -299,4 +299,21 @@ struct scsi_lun {
 #define SCSI_ACCESS_STATE_MASK        0x0f
 #define SCSI_ACCESS_STATE_PREFERRED   0x80
 
+/* Reporting options for REPORT ZONES */
+enum zbc_zone_reporting_options {
+	ZBC_ZONE_REPORTING_OPTION_ALL = 0,
+	ZBC_ZONE_REPORTING_OPTION_EMPTY,
+	ZBC_ZONE_REPORTING_OPTION_IMPLICIT_OPEN,
+	ZBC_ZONE_REPORTING_OPTION_EXPLICIT_OPEN,
+	ZBC_ZONE_REPORTING_OPTION_CLOSED,
+	ZBC_ZONE_REPORTING_OPTION_FULL,
+	ZBC_ZONE_REPORTING_OPTION_READONLY,
+	ZBC_ZONE_REPORTING_OPTION_OFFLINE,
+	ZBC_ZONE_REPORTING_OPTION_NEED_RESET_WP = 0x10,
+	ZBC_ZONE_REPORTING_OPTION_NON_SEQWRITE,
+	ZBC_ZONE_REPORTING_OPTION_NON_WP = 0x3f,
+};
+
+#define ZBC_REPORT_ZONE_PARTIAL 0x80
+
 #endif /* _SCSI_PROTO_H_ */
-- 
2.7.4
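
The write-pointer handling in sd_zbc_setup_read_write() above can be modeled in userspace. The sketch below is only an illustration of those rules, not code from this series: struct zone_model, zone_model_write() and zone_model_readable() are hypothetical names, and the model leaves out zone locking, the "updating" flag, and conventional, offline and read-only zones.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; not the kernel's struct blk_zone. */
enum zone_cond { ZONE_COND_OTHER, ZONE_COND_FULL };

struct zone_model {
	uint64_t start;		/* first sector of the zone */
	uint64_t len;		/* zone length in sectors */
	uint64_t wp;		/* cached write pointer */
	bool seq_req;		/* true: sequential write required */
	enum zone_cond cond;
};

/*
 * Apply a write of @sectors sectors at @sector to the cached zone state.
 * Returns false if the write would be rejected: it crosses the zone
 * boundary, or it does not start at the write pointer of a
 * sequential-write-required zone.
 */
static bool zone_model_write(struct zone_model *z, uint64_t sector,
			     uint64_t sectors)
{
	uint64_t zone_end = z->start + z->len;
	uint64_t end = sector + sectors;

	if (end > zone_end)
		return false;

	if (z->seq_req) {
		if (sector != z->wp)
			return false;
		z->wp = end;
		if (z->wp >= zone_end) {
			z->cond = ZONE_COND_FULL;
			z->wp = zone_end;
		}
		return true;
	}

	/* Sequential write preferred: the write pointer only moves forward. */
	if (sector == z->wp && end >= zone_end) {
		z->cond = ZONE_COND_FULL;
		z->wp = zone_end;
	} else if (end > z->wp) {
		z->wp = end;
	}
	return true;
}

/*
 * Number of sectors of a read at @sector that would still be sent to
 * the drive. Reads entirely below the write pointer are unchanged,
 * reads starting at or above it are completed with a zeroed buffer
 * (0 sectors sent), and reads straddling it are shortened to end at
 * the write pointer.
 */
static uint64_t zone_model_readable(const struct zone_model *z,
				    uint64_t sector, uint64_t sectors)
{
	if (sector + sectors <= z->wp)
		return sectors;
	if (z->wp <= sector)
		return 0;
	return z->wp - sector;
}
```

For an 8-sector sequential-write-required zone with the write pointer at its start, a 4-sector write at sector 0 is accepted and moves the write pointer to 4. A second write at sector 0 is then rejected, and a 4-sector read at sector 2 is shortened to 2 sectors.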



* [PATCH 9/9] blk-zoned: Add ioctl interface for zone operations
  2016-09-19 21:27 ` Damien Le Moal
@ 2016-09-19 21:27   ` Damien Le Moal
  -1 siblings, 0 replies; 36+ messages in thread
From: Damien Le Moal @ 2016-09-19 21:27 UTC (permalink / raw)
  To: linux-scsi, linux-block
  Cc: martin.petersen, axboe, hare, shaun.tancheff, Damien Le Moal

From: Shaun Tancheff <shaun.tancheff@seagate.com>

Adds the new BLKUPDATEZONES, BLKREPORTZONE, BLKRESETZONE,
BLKOPENZONE, BLKCLOSEZONE and BLKFINISHZONE ioctls.

By default, the BLKREPORTZONE implementation reads the device queue zone
RB-tree and issues no command to the device. If the
application needs access to the untracked zone attributes (non-seq
flag or reset recommended flag, offline or read-only zone condition,
etc), BLKUPDATEZONES must be issued first to force an update of the
cached zone information.
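
From userspace, the sequence described above might look like the following sketch. It assumes the ioctl numbers and the blkzone layout defined by this patch, copied locally so that it builds without the patched header. report_zone() is a hypothetical helper name, not part of the series.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/*
 * Local copies of the definitions this patch adds in
 * include/uapi/linux/blkzoned.h. Fixed-width types stand in for
 * __u64 and __u8.
 */
struct blkzone {
	uint64_t start;		/* Zone start sector */
	uint64_t len;		/* Zone length in number of sectors */
	uint64_t wp;		/* Zone write pointer position */
	uint8_t type;		/* Zone type */
	uint8_t cond;		/* Zone condition */
	uint8_t non_seq;	/* Non-sequential write resources active */
	uint8_t reset;		/* Reset write pointer recommended */
	uint8_t reserved[36];
};

#define BLKUPDATEZONES	_IO(0x12, 130)
#define BLKREPORTZONE	_IOWR(0x12, 131, struct blkzone)

/*
 * Hypothetical helper: fill @z with the descriptor of the zone that
 * contains @sector (512 B units). If @refresh is non-zero, the cached
 * zone information is updated first with BLKUPDATEZONES.
 * Returns 0 on success or a negative errno value.
 */
static int report_zone(int fd, uint64_t sector, int refresh,
		       struct blkzone *z)
{
	/*
	 * blkdev_zone_ioctl() in this patch rejects a NULL argument for
	 * every command, so a non-NULL pointer is passed even though
	 * BLKUPDATEZONES carries no data.
	 */
	if (refresh && ioctl(fd, BLKUPDATEZONES, z) < 0)
		return -errno;

	memset(z, 0, sizeof(*z));
	z->start = sector;
	if (ioctl(fd, BLKREPORTZONE, z) < 0)
		return -errno;

	return 0;
}
```

With this patch, all zone ioctls require CAP_SYS_ADMIN, and BLKUPDATEZONES additionally requires a file descriptor opened for writing.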

Changelog (Damien):
* Simplified the blkzone descriptor (removed bit-fields and switched to
  CPU endianness)
* Changed the report ioctl to operate on a single zone instead of an
  array of blkzone structures.

Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
---
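
As a quick check of the simplified blkzone descriptor mentioned in the changelog, the layout can be verified at compile time. This is a sketch only: struct blkzone_model is a hypothetical userspace copy of the structure added by this patch, with fixed-width types standing in for __u64 and __u8.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace copy of struct blkzone from this patch. */
struct blkzone_model {
	uint64_t start;
	uint64_t len;
	uint64_t wp;
	uint8_t type;
	uint8_t cond;
	uint8_t non_seq;
	uint8_t reset;
	uint8_t reserved[36];
};

/* 3 * 8 + 4 * 1 + 36 = 64 bytes, with no padding expected. */
_Static_assert(sizeof(struct blkzone_model) == 64,
	       "descriptor must be 64 B");
_Static_assert(offsetof(struct blkzone_model, type) == 24,
	       "type must follow the three 64-bit fields");
_Static_assert(offsetof(struct blkzone_model, reserved) == 28,
	       "reserved area must start right after the flags");
```

Without bit-fields, every member sits at a fixed byte offset, so the structure needs no packing attribute to reach the 64 B size stated in the header comment.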
 block/blk-zoned.c             | 115 ++++++++++++++++++++++++++++++++++++++++++
 block/ioctl.c                 |   8 +++
 include/linux/blkdev.h        |   7 +++
 include/uapi/linux/Kbuild     |   1 +
 include/uapi/linux/blkzoned.h |  91 +++++++++++++++++++++++++++++++++
 include/uapi/linux/fs.h       |   1 +
 6 files changed, 223 insertions(+)
 create mode 100644 include/uapi/linux/blkzoned.h

diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index a107940..71205c8 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -12,6 +12,7 @@
 #include <linux/module.h>
 #include <linux/rbtree.h>
 #include <linux/blkdev.h>
+#include <linux/blkzoned.h>
 
 void blk_init_zones(struct request_queue *q)
 {
@@ -336,3 +337,117 @@ int blkdev_finish_zone(struct block_device *bdev,
 	return blkdev_issue_zone_action(bdev, sector, REQ_OP_ZONE_FINISH,
 					gfp_mask);
 }
+
+static int blkdev_report_zone_ioctl(struct block_device *bdev,
+				    void __user *argp)
+{
+	struct blk_zone *zone;
+	struct blkzone z;
+
+	if (copy_from_user(&z, argp, sizeof(struct blkzone)))
+		return -EFAULT;
+
+	zone = blk_lookup_zone(bdev_get_queue(bdev), z.start);
+	if (!zone)
+		return -EINVAL;
+
+	memset(&z, 0, sizeof(struct blkzone));
+
+	blk_lock_zone(zone);
+
+	blk_wait_for_zone_update(zone);
+
+	z.len = zone->len;
+	z.start = zone->start;
+	z.wp = zone->wp;
+	z.type = zone->type;
+	z.cond = zone->cond;
+	z.non_seq = zone->non_seq;
+	z.reset = zone->reset;
+
+	blk_unlock_zone(zone);
+
+	if (copy_to_user(argp, &z, sizeof(struct blkzone)))
+		return -EFAULT;
+
+	return 0;
+}
+
+static int blkdev_zone_action_ioctl(struct block_device *bdev,
+				    unsigned cmd, void __user *argp)
+{
+	unsigned int op;
+	u64 sector;
+
+	if (get_user(sector, (u64 __user *)argp))
+		return -EFAULT;
+
+	switch (cmd) {
+	case BLKRESETZONE:
+		op = REQ_OP_ZONE_RESET;
+		break;
+	case BLKOPENZONE:
+		op = REQ_OP_ZONE_OPEN;
+		break;
+	case BLKCLOSEZONE:
+		op = REQ_OP_ZONE_CLOSE;
+		break;
+	case BLKFINISHZONE:
+		op = REQ_OP_ZONE_FINISH;
+		break;
+	}
+
+	return blkdev_issue_zone_action(bdev, sector, op, GFP_KERNEL);
+}
+
+/*
+ * Zone ioctl entry point, called from blkdev_ioctl().
+ */
+int blkdev_zone_ioctl(struct block_device *bdev, fmode_t mode,
+		      unsigned cmd, unsigned long arg)
+{
+	void __user *argp = (void __user *)arg;
+	struct request_queue *q;
+	int ret;
+
+	if (!argp)
+		return -EINVAL;
+
+	q = bdev_get_queue(bdev);
+	if (!q)
+		return -ENXIO;
+
+	if (!blk_queue_zoned(q))
+		return -ENOTTY;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EACCES;
+
+	switch (cmd) {
+	case BLKREPORTZONE:
+		ret = blkdev_report_zone_ioctl(bdev, argp);
+		break;
+	case BLKUPDATEZONES:
+		if (!(mode & FMODE_WRITE)) {
+			ret = -EBADF;
+			break;
+		}
+		ret = blkdev_update_zones(bdev, GFP_KERNEL);
+		break;
+	case BLKRESETZONE:
+	case BLKOPENZONE:
+	case BLKCLOSEZONE:
+	case BLKFINISHZONE:
+		if (!(mode & FMODE_WRITE)) {
+			ret = -EBADF;
+			break;
+		}
+		ret = blkdev_zone_action_ioctl(bdev, cmd, argp);
+		break;
+	default:
+		ret = -ENOTTY;
+		break;
+	}
+
+	return ret;
+}
diff --git a/block/ioctl.c b/block/ioctl.c
index ed2397f..f09679a 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -3,6 +3,7 @@
 #include <linux/export.h>
 #include <linux/gfp.h>
 #include <linux/blkpg.h>
+#include <linux/blkzoned.h>
 #include <linux/hdreg.h>
 #include <linux/backing-dev.h>
 #include <linux/fs.h>
@@ -513,6 +514,13 @@ int blkdev_ioctl(struct block_device *bdev, fmode_t mode, unsigned cmd,
 				BLKDEV_DISCARD_SECURE);
 	case BLKZEROOUT:
 		return blk_ioctl_zeroout(bdev, mode, arg);
+	case BLKUPDATEZONES:
+	case BLKREPORTZONE:
+	case BLKRESETZONE:
+	case BLKOPENZONE:
+	case BLKCLOSEZONE:
+	case BLKFINISHZONE:
+		return blkdev_zone_ioctl(bdev, mode, cmd, arg);
 	case HDIO_GETGEO:
 		return blkdev_getgeo(bdev, argp);
 	case BLKRAGET:
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index a85f95b..0299d41 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -405,9 +405,16 @@ extern int blkdev_reset_zone(struct block_device *, sector_t, gfp_t);
 extern int blkdev_open_zone(struct block_device *, sector_t, gfp_t);
 extern int blkdev_close_zone(struct block_device *, sector_t, gfp_t);
 extern int blkdev_finish_zone(struct block_device *, sector_t, gfp_t);
+extern int blkdev_zone_ioctl(struct block_device *, fmode_t, unsigned int,
+			     unsigned long);
 #else /* CONFIG_BLK_DEV_ZONED */
 static inline void blk_init_zones(struct request_queue *q) { };
 static inline void blk_drop_zones(struct request_queue *q) { };
+static inline int blkdev_zone_ioctl(struct block_device *bdev, fmode_t mode,
+				    unsigned cmd, unsigned long arg)
+{
+	return -ENOTTY;
+}
 #endif /* CONFIG_BLK_DEV_ZONED */
 
 struct request_queue {
diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
index 185f8ea..a2a7522 100644
--- a/include/uapi/linux/Kbuild
+++ b/include/uapi/linux/Kbuild
@@ -70,6 +70,7 @@ header-y += bfs_fs.h
 header-y += binfmts.h
 header-y += blkpg.h
 header-y += blktrace_api.h
+header-y += blkzoned.h
 header-y += bpf_common.h
 header-y += bpf.h
 header-y += bpqether.h
diff --git a/include/uapi/linux/blkzoned.h b/include/uapi/linux/blkzoned.h
new file mode 100644
index 0000000..23a2702
--- /dev/null
+++ b/include/uapi/linux/blkzoned.h
@@ -0,0 +1,91 @@
+/*
+ * Zoned block devices handling.
+ *
+ * Copyright (C) 2015 Seagate Technology PLC
+ *
+ * Written by: Shaun Tancheff <shaun.tancheff@seagate.com>
+ *
+ * Modified by: Damien Le Moal <damien.lemoal@hgst.com>
+ * Copyright (C) 2016 Western Digital
+ *
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+#ifndef _UAPI_BLKZONED_H
+#define _UAPI_BLKZONED_H
+
+#include <linux/types.h>
+#include <linux/ioctl.h>
+
+/*
+ * Zone type.
+ */
+enum blkzone_type {
+	BLKZONE_TYPE_UNKNOWN,
+	BLKZONE_TYPE_CONVENTIONAL,
+	BLKZONE_TYPE_SEQWRITE_REQ,
+	BLKZONE_TYPE_SEQWRITE_PREF,
+};
+
+/*
+ * Zone condition.
+ */
+enum blkzone_cond {
+	BLKZONE_COND_NO_WP,
+	BLKZONE_COND_EMPTY,
+	BLKZONE_COND_IMP_OPEN,
+	BLKZONE_COND_EXP_OPEN,
+	BLKZONE_COND_CLOSED,
+	BLKZONE_COND_READONLY = 0xd,
+	BLKZONE_COND_FULL,
+	BLKZONE_COND_OFFLINE,
+};
+
+/*
+ * Zone descriptor for BLKREPORTZONE.
+ * start, len and wp use the regular 512 B sector unit,
+ * regardless of the device logical block size. The overall
+ * structure size is 64 B to match the ZBC/ZAC defined zone descriptor
+ * and allow support for future additional zone information.
+ */
+struct blkzone {
+	__u64	start;		/* Zone start sector */
+	__u64	len;		/* Zone length in number of sectors */
+	__u64	wp;		/* Zone write pointer position */
+	__u8	type;		/* Zone type */
+	__u8	cond;		/* Zone condition */
+	__u8	non_seq;	/* Non-sequential write resources active */
+	__u8	reset;		/* Reset write pointer recommended */
+	__u8	reserved[36];
+};
+
+/*
+ * Zone ioctl's:
+ *
+ * BLKUPDATEZONES	: Force update of all zones information
+ * BLKREPORTZONE	: Get a zone descriptor. Takes a zone descriptor as
+ *                        argument. The zone to report is the one
+ *                        containing the sector initially specified in the
+ *                        descriptor start field.
+ * BLKRESETZONE		: Reset the write pointer of the zone containing the
+ *                        specified sector, or of all written zones if the
+ *                        sector is ~0ull.
+ * BLKOPENZONE		: Explicitly open the zone containing the
+ *                        specified sector, or all possible zones if the
+ *                        sector is ~0ull (the drive determines which zone
+ *                        to open in this case).
+ * BLKCLOSEZONE		: Close the zone containing the specified sector, or
+ *                        all open zones if the sector is ~0ull.
+ * BLKFINISHZONE	: Finish the zone (make it full) containing the
+ *                        specified sector, or all open and closed zones if
+ *                        the sector is ~0ull.
+ */
+#define BLKUPDATEZONES	_IO(0x12,130)
+#define BLKREPORTZONE 	_IOWR(0x12,131,struct blkzone)
+#define BLKRESETZONE 	_IOW(0x12,132,unsigned long long)
+#define BLKOPENZONE 	_IOW(0x12,133,unsigned long long)
+#define BLKCLOSEZONE 	_IOW(0x12,134,unsigned long long)
+#define BLKFINISHZONE 	_IOW(0x12,135,unsigned long long)
+
+#endif /* _UAPI_BLKZONED_H */
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index 3b00f7c..1db6d66 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -222,6 +222,7 @@ struct fsxattr {
 #define BLKSECDISCARD _IO(0x12,125)
 #define BLKROTATIONAL _IO(0x12,126)
 #define BLKZEROOUT _IO(0x12,127)
+/* A jump here: 130-135 are used for zoned block devices (see uapi/linux/blkzoned.h) */
 
 #define BMAP_IOCTL 1		/* obsolete - kept for compatibility */
 #define FIBMAP	   _IO(0x00,1)	/* bmap access */
-- 
2.7.4
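
The zone action ioctls documented in blkzoned.h above all take a pointer to a 64-bit sector, with ~0ull selecting all applicable zones. A minimal userspace sketch, assuming the ioctl numbers from this patch (copied locally), could look like this. zone_action() and ALL_ZONES are hypothetical names used only for illustration.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Local copies of the ioctl numbers defined by this patch. */
#define BLKRESETZONE	_IOW(0x12, 132, unsigned long long)
#define BLKOPENZONE	_IOW(0x12, 133, unsigned long long)
#define BLKCLOSEZONE	_IOW(0x12, 134, unsigned long long)
#define BLKFINISHZONE	_IOW(0x12, 135, unsigned long long)

/* Hypothetical name for the "all zones" sector value (~0ull). */
#define ALL_ZONES	(~0ull)

/*
 * Hypothetical helper: issue one of the zone action ioctls for the
 * zone containing @sector (512 B units), or for all applicable zones
 * if @sector is ALL_ZONES.
 * Returns 0 on success or a negative errno value.
 */
static int zone_action(int fd, unsigned long request, uint64_t sector)
{
	unsigned long long arg = sector;

	if (ioctl(fd, request, &arg) < 0)
		return -errno;

	return 0;
}
```

A call such as zone_action(fd, BLKRESETZONE, ALL_ZONES) would ask for the write pointers of all written zones to be reset. With this patch, the file descriptor must be opened for writing and the caller needs CAP_SYS_ADMIN.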


* [PATCH 9/9] blk-zoned: Add ioctl interface for zone operations
@ 2016-09-19 21:27   ` Damien Le Moal
  0 siblings, 0 replies; 36+ messages in thread
From: Damien Le Moal @ 2016-09-19 21:27 UTC (permalink / raw)
  To: linux-scsi, linux-block
  Cc: martin.petersen, axboe, hare, shaun.tancheff, Damien Le Moal

From: Shaun Tancheff <shaun.tancheff@seagate.com>

Adds the new BLKUPDATEZONES, BLKREPORTZONE, BLKRESETZONE,
BLKOPENZONE, BLKCLOSEZONE and BLKFINISHZONE ioctls.

BLKREPORTZONE implementation uses the device queue zone RB-tree by
default and no actual command is issued to the device. If the
application needs access to the untracked zone attributes (non-seq
flag or reset recommended flag, offline or read-only zone condition,
etc), BLKUPDATEZONES must be issued first to force an update of the
cached zone information.

Changelog (Damien):
* Simplified blkzone descriptor (removed bit-fields and use CPU
  endianness)
* Changed report ioctl to operate on single zone instead of an
  array of blkzone structures.

Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
---
 block/blk-zoned.c             | 115 ++++++++++++++++++++++++++++++++++++++++++
 block/ioctl.c                 |   8 +++
 include/linux/blkdev.h        |   7 +++
 include/uapi/linux/Kbuild     |   1 +
 include/uapi/linux/blkzoned.h |  91 +++++++++++++++++++++++++++++++++
 include/uapi/linux/fs.h       |   1 +
 6 files changed, 223 insertions(+)
 create mode 100644 include/uapi/linux/blkzoned.h

diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index a107940..71205c8 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -12,6 +12,7 @@
 #include <linux/module.h>
 #include <linux/rbtree.h>
 #include <linux/blkdev.h>
+#include <linux/blkzoned.h>
 
 void blk_init_zones(struct request_queue *q)
 {
@@ -336,3 +337,117 @@ int blkdev_finish_zone(struct block_device *bdev,
 	return blkdev_issue_zone_action(bdev, sector, REQ_OP_ZONE_FINISH,
 					gfp_mask);
 }
+
+static int blkdev_report_zone_ioctl(struct block_device *bdev,
+				    void __user *argp)
+{
+	struct blk_zone *zone;
+	struct blkzone z;
+
+	if (copy_from_user(&z, argp, sizeof(struct blkzone)))
+		return -EFAULT;
+
+	zone = blk_lookup_zone(bdev_get_queue(bdev), z.start);
+	if (!zone)
+		return -EINVAL;
+
+	memset(&z, 0, sizeof(struct blkzone));
+
+	blk_lock_zone(zone);
+
+	blk_wait_for_zone_update(zone);
+
+	z.len = zone->len;
+	z.start = zone->start;
+	z.wp = zone->wp;
+	z.type = zone->type;
+	z.cond = zone->cond;
+	z.non_seq = zone->non_seq;
+	z.reset = zone->reset;
+
+	blk_unlock_zone(zone);
+
+	if (copy_to_user(argp, &z, sizeof(struct blkzone)))
+		return -EFAULT;
+
+	return 0;
+}
+
+static int blkdev_zone_action_ioctl(struct block_device *bdev,
+				    unsigned cmd, void __user *argp)
+{
+	unsigned int op;
+	u64 sector;
+
+	if (get_user(sector, (u64 __user *)argp))
+		return -EFAULT;
+
+	switch (cmd) {
+	case BLKRESETZONE:
+		op = REQ_OP_ZONE_RESET;
+		break;
+	case BLKOPENZONE:
+		op = REQ_OP_ZONE_OPEN;
+		break;
+	case BLKCLOSEZONE:
+		op = REQ_OP_ZONE_CLOSE;
+		break;
+	case BLKFINISHZONE:
+		op = REQ_OP_ZONE_FINISH;
+		break;
+	}
+
+	return blkdev_issue_zone_action(bdev, sector, op, GFP_KERNEL);
+}
+
+/*
+ * Called from blkdev_ioctl.
+ */
+int blkdev_zone_ioctl(struct block_device *bdev, fmode_t mode,
+		      unsigned cmd, unsigned long arg)
+{
+	void __user *argp = (void __user *)arg;
+	struct request_queue *q;
+	int ret;
+
+	if (!argp)
+		return -EINVAL;
+
+	q = bdev_get_queue(bdev);
+	if (!q)
+		return -ENXIO;
+
+	if (!blk_queue_zoned(q))
+		return -ENOTTY;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EACCES;
+
+	switch (cmd) {
+	case BLKREPORTZONE:
+		ret = blkdev_report_zone_ioctl(bdev, argp);
+		break;
+	case BLKUPDATEZONES:
+		if (!(mode & FMODE_WRITE)) {
+			ret = -EBADF;
+			break;
+		}
+		ret = blkdev_update_zones(bdev, GFP_KERNEL);
+		break;
+	case BLKRESETZONE:
+	case BLKOPENZONE:
+	case BLKCLOSEZONE:
+	case BLKFINISHZONE:
+		if (!(mode & FMODE_WRITE)) {
+			ret = -EBADF;
+			break;
+		}
+		ret = blkdev_zone_action_ioctl(bdev, cmd, argp);
+		break;
+	default:
+		ret = -ENOTTY;
+		break;
+	}
+
+	return ret;
+}
diff --git a/block/ioctl.c b/block/ioctl.c
index ed2397f..f09679a 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -3,6 +3,7 @@
 #include <linux/export.h>
 #include <linux/gfp.h>
 #include <linux/blkpg.h>
+#include <linux/blkzoned.h>
 #include <linux/hdreg.h>
 #include <linux/backing-dev.h>
 #include <linux/fs.h>
@@ -513,6 +514,13 @@ int blkdev_ioctl(struct block_device *bdev, fmode_t mode, unsigned cmd,
 				BLKDEV_DISCARD_SECURE);
 	case BLKZEROOUT:
 		return blk_ioctl_zeroout(bdev, mode, arg);
+	case BLKUPDATEZONES:
+	case BLKREPORTZONE:
+	case BLKRESETZONE:
+	case BLKOPENZONE:
+	case BLKCLOSEZONE:
+	case BLKFINISHZONE:
+		return blkdev_zone_ioctl(bdev, mode, cmd, arg);
 	case HDIO_GETGEO:
 		return blkdev_getgeo(bdev, argp);
 	case BLKRAGET:
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index a85f95b..0299d41 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -405,9 +405,16 @@ extern int blkdev_reset_zone(struct block_device *, sector_t, gfp_t);
 extern int blkdev_open_zone(struct block_device *, sector_t, gfp_t);
 extern int blkdev_close_zone(struct block_device *, sector_t, gfp_t);
 extern int blkdev_finish_zone(struct block_device *, sector_t, gfp_t);
+extern int blkdev_zone_ioctl(struct block_device *, fmode_t, unsigned int,
+			     unsigned long);
 #else /* CONFIG_BLK_DEV_ZONED */
 static inline void blk_init_zones(struct request_queue *q) { };
 static inline void blk_drop_zones(struct request_queue *q) { };
+static inline int blkdev_zone_ioctl(struct block_device *bdev, fmode_t mode,
+				    unsigned cmd, unsigned long arg)
+{
+	return -ENOTTY;
+}
 #endif /* CONFIG_BLK_DEV_ZONED */
 
 struct request_queue {
diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
index 185f8ea..a2a7522 100644
--- a/include/uapi/linux/Kbuild
+++ b/include/uapi/linux/Kbuild
@@ -70,6 +70,7 @@ header-y += bfs_fs.h
 header-y += binfmts.h
 header-y += blkpg.h
 header-y += blktrace_api.h
+header-y += blkzoned.h
 header-y += bpf_common.h
 header-y += bpf.h
 header-y += bpqether.h
diff --git a/include/uapi/linux/blkzoned.h b/include/uapi/linux/blkzoned.h
new file mode 100644
index 0000000..23a2702
--- /dev/null
+++ b/include/uapi/linux/blkzoned.h
@@ -0,0 +1,91 @@
+/*
+ * Zoned block devices handling.
+ *
+ * Copyright (C) 2015 Seagate Technology PLC
+ *
+ * Written by: Shaun Tancheff <shaun.tancheff@seagate.com>
+ *
+ * Modified by: Damien Le Moal <damien.lemoal@hgst.com>
+ * Copyright (C) 2016 Western Digital
+ *
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+#ifndef _UAPI_BLKZONED_H
+#define _UAPI_BLKZONED_H
+
+#include <linux/types.h>
+#include <linux/ioctl.h>
+
+/*
+ * Zone type.
+ */
+enum blkzone_type {
+	BLKZONE_TYPE_UNKNOWN,
+	BLKZONE_TYPE_CONVENTIONAL,
+	BLKZONE_TYPE_SEQWRITE_REQ,
+	BLKZONE_TYPE_SEQWRITE_PREF,
+};
+
+/*
+ * Zone condition.
+ */
+enum blkzone_cond {
+	BLKZONE_COND_NO_WP,
+	BLKZONE_COND_EMPTY,
+	BLKZONE_COND_IMP_OPEN,
+	BLKZONE_COND_EXP_OPEN,
+	BLKZONE_COND_CLOSED,
+	BLKZONE_COND_READONLY = 0xd,
+	BLKZONE_COND_FULL,
+	BLKZONE_COND_OFFLINE,
+};
+
+/*
+ * Zone descriptor for BLKREPORTZONE.
+ * start, len and wp use the regular 512 B sector unit,
+ * regardless of the device logical block size. The overall
+ * structure size is 64 B to match the ZBC/ZAC defined zone descriptor
+ * and allow support for future additional zone information.
+ */
+struct blkzone {
+	__u64	start;		/* Zone start sector */
+	__u64	len;		/* Zone length in number of sectors */
+	__u64	wp;		/* Zone write pointer position */
+	__u8	type;		/* Zone type */
+	__u8	cond;		/* Zone condition */
+	__u8	non_seq;	/* Non-sequential write resources active */
+	__u8	reset;		/* Reset write pointer recommended */
+	__u8	reserved[36];
+};
+
+/*
+ * Zone ioctl's:
+ *
+ * BLKUPDATEZONES	: Force an update of the cached information of all zones
+ * BLKREPORTZONE	: Get a zone descriptor. Takes a zone descriptor as
+ *                        argument. The zone to report is the one
+ *                        containing the sector initially specified in the
+ *                        descriptor start field.
+ * BLKRESETZONE		: Reset the write pointer of the zone containing the
+ *                        specified sector, or of all written zones if the
+ *                        sector is ~0ull.
+ * BLKOPENZONE		: Explicitly open the zone containing the
+ *                        specified sector, or all possible zones if the
+ *                        sector is ~0ull (the drive determines which zone
+ *                        to open in this case).
+ * BLKCLOSEZONE		: Close the zone containing the specified sector, or
+ *                        all open zones if the sector is ~0ull.
+ * BLKFINISHZONE	: Finish the zone (make it full) containing the
+ *                        specified sector, or all open and closed zones if
+ *                        the sector is ~0ull.
+ */
+#define BLKUPDATEZONES	_IO(0x12,130)
+#define BLKREPORTZONE	_IOWR(0x12,131,struct blkzone)
+#define BLKRESETZONE	_IOW(0x12,132,unsigned long long)
+#define BLKOPENZONE	_IOW(0x12,133,unsigned long long)
+#define BLKCLOSEZONE	_IOW(0x12,134,unsigned long long)
+#define BLKFINISHZONE	_IOW(0x12,135,unsigned long long)
+
+#endif /* _UAPI_BLKZONED_H */
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index 3b00f7c..1db6d66 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -222,6 +222,7 @@ struct fsxattr {
 #define BLKSECDISCARD _IO(0x12,125)
 #define BLKROTATIONAL _IO(0x12,126)
 #define BLKZEROOUT _IO(0x12,127)
+/* A jump here: 130-135 are used for zoned block devices (see uapi/linux/blkzoned.h) */
 
 #define BMAP_IOCTL 1		/* obsolete - kept for compatibility */
 #define FIBMAP	   _IO(0x00,1)	/* bmap access */
-- 
2.7.4



* Re: [PATCH 8/9] sd: Implement support for ZBC devices
  2016-09-19 21:27   ` Damien Le Moal
@ 2016-09-20  0:08     ` kbuild test robot
  -1 siblings, 0 replies; 36+ messages in thread
From: kbuild test robot @ 2016-09-20  0:08 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: kbuild-all, linux-scsi, linux-block, martin.petersen, axboe,
	hare, shaun.tancheff, Hannes Reinecke, Damien Le Moal

[-- Attachment #1: Type: text/plain, Size: 30989 bytes --]

Hi Hannes,

[auto build test WARNING on linus/master]
[also build test WARNING on v4.8-rc7]
[cannot apply to block/for-next next-20160919]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
[Suggest to use git(>=2.9.0) format-patch --base=<commit> (or --base=auto for convenience) to record what (public, well-known) commit your patch series was built on]
[Check https://git-scm.com/docs/git-format-patch for more information]

url:    https://github.com/0day-ci/linux/commits/Damien-Le-Moal/ZBC-Zoned-block-device-support/20160920-062608
config: i386-allmodconfig (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All warnings (new ones prefixed by >>):

   In file included from include/linux/kernel.h:13:0,
                    from include/linux/sched.h:17,
                    from include/linux/blkdev.h:4,
                    from drivers/scsi/sd_zbc.c:24:
   drivers/scsi/sd_zbc.c: In function 'sd_zbc_report_zones':
>> drivers/scsi/sd_zbc.c:61:11: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
     pr_debug("%s %s [%s]: " fmt,    \
              ^
   include/linux/printk.h:260:21: note: in definition of macro 'pr_fmt'
    #define pr_fmt(fmt) fmt
                        ^~~
   include/linux/printk.h:308:2: note: in expansion of macro 'dynamic_pr_debug'
     dynamic_pr_debug(fmt, ##__VA_ARGS__)
     ^~~~~~~~~~~~~~~~
>> drivers/scsi/sd_zbc.c:61:2: note: in expansion of macro 'pr_debug'
     pr_debug("%s %s [%s]: " fmt,    \
     ^~~~~~~~
>> drivers/scsi/sd_zbc.c:221:2: note: in expansion of macro 'sd_zbc_debug'
     sd_zbc_debug(sdkp, "REPORT ZONES lba %zu len %d\n",
     ^~~~~~~~~~~~
   In file included from include/linux/printk.h:6:0,
                    from include/linux/kernel.h:13,
                    from include/linux/sched.h:17,
                    from include/linux/blkdev.h:4,
                    from drivers/scsi/sd_zbc.c:24:
>> include/linux/kern_levels.h:4:18: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
    #define KERN_SOH "\001"  /* ASCII Start Of Header */
                     ^
   include/linux/kern_levels.h:10:18: note: in expansion of macro 'KERN_SOH'
    #define KERN_ERR KERN_SOH "3" /* error conditions */
                     ^~~~~~~~
   include/linux/printk.h:276:9: note: in expansion of macro 'KERN_ERR'
     printk(KERN_ERR pr_fmt(fmt), ##__VA_ARGS__)
            ^~~~~~~~
>> drivers/scsi/sd_zbc.c:73:2: note: in expansion of macro 'pr_err'
     pr_err("%s %s [%s]: " fmt,    \
     ^~~~~~
>> drivers/scsi/sd_zbc.c:237:3: note: in expansion of macro 'sd_zbc_err'
      sd_zbc_err(sdkp,
      ^~~~~~~~~~
   In file included from include/linux/kernel.h:13:0,
                    from include/linux/sched.h:17,
                    from include/linux/blkdev.h:4,
                    from drivers/scsi/sd_zbc.c:24:
   drivers/scsi/sd_zbc.c: In function 'sd_zbc_setup_reset_cmnd':
>> drivers/scsi/sd_zbc.c:61:11: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
     pr_debug("%s %s [%s]: " fmt,    \
              ^
   include/linux/printk.h:260:21: note: in definition of macro 'pr_fmt'
    #define pr_fmt(fmt) fmt
                        ^~~
   include/linux/printk.h:308:2: note: in expansion of macro 'dynamic_pr_debug'
     dynamic_pr_debug(fmt, ##__VA_ARGS__)
     ^~~~~~~~~~~~~~~~
>> drivers/scsi/sd_zbc.c:61:2: note: in expansion of macro 'pr_debug'
     pr_debug("%s %s [%s]: " fmt,    \
     ^~~~~~~~
   drivers/scsi/sd_zbc.c:489:4: note: in expansion of macro 'sd_zbc_debug'
       sd_zbc_debug(sdkp,
       ^~~~~~~~~~~~
   In file included from drivers/scsi/sd_zbc.c:37:0:
   drivers/scsi/sd_zbc.c:517:7: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned reset wp request, start %zu/%zu"
          ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
   drivers/scsi/sd_zbc.c:517:7: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned reset wp request, start %zu/%zu"
          ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
   drivers/scsi/sd_zbc.c:517:7: warning: format '%zu' expects argument of type 'size_t', but argument 7 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned reset wp request, start %zu/%zu"
          ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
   drivers/scsi/sd_zbc.c:517:7: warning: format '%zu' expects argument of type 'size_t', but argument 8 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned reset wp request, start %zu/%zu"
          ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
   In file included from include/scsi/scsi_cmnd.h:10:0,
                    from drivers/scsi/sd_zbc.c:30:
   drivers/scsi/sd_zbc.c:517:7: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned reset wp request, start %zu/%zu"
          ^
   include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
     sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
                                       ^~~
>> drivers/scsi/sd_zbc.c:516:4: note: in expansion of macro 'sd_printk'
       sd_printk(KERN_ERR, sdkp,
       ^~~~~~~~~
   drivers/scsi/sd_zbc.c:517:7: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned reset wp request, start %zu/%zu"
          ^
   include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
     sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
                                       ^~~
>> drivers/scsi/sd_zbc.c:516:4: note: in expansion of macro 'sd_printk'
       sd_printk(KERN_ERR, sdkp,
       ^~~~~~~~~
   drivers/scsi/sd_zbc.c:517:7: warning: format '%zu' expects argument of type 'size_t', but argument 7 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned reset wp request, start %zu/%zu"
          ^
   include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
     sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
                                       ^~~
>> drivers/scsi/sd_zbc.c:516:4: note: in expansion of macro 'sd_printk'
       sd_printk(KERN_ERR, sdkp,
       ^~~~~~~~~
   drivers/scsi/sd_zbc.c:517:7: warning: format '%zu' expects argument of type 'size_t', but argument 8 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned reset wp request, start %zu/%zu"
          ^
   include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
     sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
                                       ^~~
>> drivers/scsi/sd_zbc.c:516:4: note: in expansion of macro 'sd_printk'
       sd_printk(KERN_ERR, sdkp,
       ^~~~~~~~~
   In file included from include/linux/kernel.h:13:0,
                    from include/linux/sched.h:17,
                    from include/linux/blkdev.h:4,
                    from drivers/scsi/sd_zbc.c:24:
   drivers/scsi/sd_zbc.c: In function 'sd_zbc_setup_open_cmnd':
>> drivers/scsi/sd_zbc.c:61:11: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
     pr_debug("%s %s [%s]: " fmt,    \
              ^
   include/linux/printk.h:260:21: note: in definition of macro 'pr_fmt'
    #define pr_fmt(fmt) fmt
                        ^~~
   include/linux/printk.h:308:2: note: in expansion of macro 'dynamic_pr_debug'
     dynamic_pr_debug(fmt, ##__VA_ARGS__)
     ^~~~~~~~~~~~~~~~
>> drivers/scsi/sd_zbc.c:61:2: note: in expansion of macro 'pr_debug'
     pr_debug("%s %s [%s]: " fmt,    \
     ^~~~~~~~
   drivers/scsi/sd_zbc.c:573:4: note: in expansion of macro 'sd_zbc_debug'
       sd_zbc_debug(sdkp,
       ^~~~~~~~~~~~
   In file included from drivers/scsi/sd_zbc.c:37:0:
   drivers/scsi/sd_zbc.c:594:7: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned open zone request, start %zu/%zu"
          ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
   drivers/scsi/sd_zbc.c:594:7: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned open zone request, start %zu/%zu"
          ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
   drivers/scsi/sd_zbc.c:594:7: warning: format '%zu' expects argument of type 'size_t', but argument 7 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned open zone request, start %zu/%zu"
          ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
   drivers/scsi/sd_zbc.c:594:7: warning: format '%zu' expects argument of type 'size_t', but argument 8 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned open zone request, start %zu/%zu"
          ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
   In file included from include/scsi/scsi_cmnd.h:10:0,
                    from drivers/scsi/sd_zbc.c:30:
   drivers/scsi/sd_zbc.c:594:7: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned open zone request, start %zu/%zu"
          ^
   include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
     sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
                                       ^~~
   drivers/scsi/sd_zbc.c:593:4: note: in expansion of macro 'sd_printk'
       sd_printk(KERN_ERR, sdkp,
       ^~~~~~~~~
   drivers/scsi/sd_zbc.c:594:7: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned open zone request, start %zu/%zu"
          ^
   include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
     sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
                                       ^~~
   drivers/scsi/sd_zbc.c:593:4: note: in expansion of macro 'sd_printk'
       sd_printk(KERN_ERR, sdkp,
       ^~~~~~~~~
   drivers/scsi/sd_zbc.c:594:7: warning: format '%zu' expects argument of type 'size_t', but argument 7 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned open zone request, start %zu/%zu"
          ^
   include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
     sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
                                       ^~~
   drivers/scsi/sd_zbc.c:593:4: note: in expansion of macro 'sd_printk'
       sd_printk(KERN_ERR, sdkp,
       ^~~~~~~~~
   drivers/scsi/sd_zbc.c:594:7: warning: format '%zu' expects argument of type 'size_t', but argument 8 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned open zone request, start %zu/%zu"
          ^
   include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
     sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
                                       ^~~
   drivers/scsi/sd_zbc.c:593:4: note: in expansion of macro 'sd_printk'
       sd_printk(KERN_ERR, sdkp,
       ^~~~~~~~~
   In file included from include/linux/kernel.h:13:0,
                    from include/linux/sched.h:17,
                    from include/linux/blkdev.h:4,
                    from drivers/scsi/sd_zbc.c:24:
   drivers/scsi/sd_zbc.c: In function 'sd_zbc_setup_close_cmnd':
>> drivers/scsi/sd_zbc.c:61:11: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
     pr_debug("%s %s [%s]: " fmt,    \
              ^
   include/linux/printk.h:260:21: note: in definition of macro 'pr_fmt'
    #define pr_fmt(fmt) fmt
                        ^~~
   include/linux/printk.h:308:2: note: in expansion of macro 'dynamic_pr_debug'
     dynamic_pr_debug(fmt, ##__VA_ARGS__)
     ^~~~~~~~~~~~~~~~
>> drivers/scsi/sd_zbc.c:61:2: note: in expansion of macro 'pr_debug'
     pr_debug("%s %s [%s]: " fmt,    \
     ^~~~~~~~
   drivers/scsi/sd_zbc.c:645:4: note: in expansion of macro 'sd_zbc_debug'
       sd_zbc_debug(sdkp,
       ^~~~~~~~~~~~
   In file included from drivers/scsi/sd_zbc.c:37:0:
   drivers/scsi/sd_zbc.c:666:7: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned close zone request, start %zu/%zu"
          ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
   drivers/scsi/sd_zbc.c:666:7: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned close zone request, start %zu/%zu"
          ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
   drivers/scsi/sd_zbc.c:666:7: warning: format '%zu' expects argument of type 'size_t', but argument 7 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned close zone request, start %zu/%zu"
          ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
   drivers/scsi/sd_zbc.c:666:7: warning: format '%zu' expects argument of type 'size_t', but argument 8 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned close zone request, start %zu/%zu"
          ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
   In file included from include/scsi/scsi_cmnd.h:10:0,
                    from drivers/scsi/sd_zbc.c:30:
   drivers/scsi/sd_zbc.c:666:7: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned close zone request, start %zu/%zu"
          ^
   include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
     sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
                                       ^~~
   drivers/scsi/sd_zbc.c:665:4: note: in expansion of macro 'sd_printk'
       sd_printk(KERN_ERR, sdkp,
       ^~~~~~~~~
   drivers/scsi/sd_zbc.c:666:7: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned close zone request, start %zu/%zu"
          ^
   include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
     sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
                                       ^~~
   drivers/scsi/sd_zbc.c:665:4: note: in expansion of macro 'sd_printk'
       sd_printk(KERN_ERR, sdkp,
       ^~~~~~~~~
   drivers/scsi/sd_zbc.c:666:7: warning: format '%zu' expects argument of type 'size_t', but argument 7 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned close zone request, start %zu/%zu"
          ^
   include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
     sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
                                       ^~~
   drivers/scsi/sd_zbc.c:665:4: note: in expansion of macro 'sd_printk'
       sd_printk(KERN_ERR, sdkp,
       ^~~~~~~~~
   drivers/scsi/sd_zbc.c:666:7: warning: format '%zu' expects argument of type 'size_t', but argument 8 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned close zone request, start %zu/%zu"
          ^
   include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
     sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
                                       ^~~
   drivers/scsi/sd_zbc.c:665:4: note: in expansion of macro 'sd_printk'
       sd_printk(KERN_ERR, sdkp,
       ^~~~~~~~~
   In file included from include/linux/kernel.h:13:0,
                    from include/linux/sched.h:17,
                    from include/linux/blkdev.h:4,
                    from drivers/scsi/sd_zbc.c:24:
   drivers/scsi/sd_zbc.c: In function 'sd_zbc_setup_finish_cmnd':
>> drivers/scsi/sd_zbc.c:61:11: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
     pr_debug("%s %s [%s]: " fmt,    \
              ^
   include/linux/printk.h:260:21: note: in definition of macro 'pr_fmt'
    #define pr_fmt(fmt) fmt
                        ^~~
   include/linux/printk.h:308:2: note: in expansion of macro 'dynamic_pr_debug'
     dynamic_pr_debug(fmt, ##__VA_ARGS__)
     ^~~~~~~~~~~~~~~~
>> drivers/scsi/sd_zbc.c:61:2: note: in expansion of macro 'pr_debug'
     pr_debug("%s %s [%s]: " fmt,    \
     ^~~~~~~~
   drivers/scsi/sd_zbc.c:717:4: note: in expansion of macro 'sd_zbc_debug'
       sd_zbc_debug(sdkp,
       ^~~~~~~~~~~~
   In file included from drivers/scsi/sd_zbc.c:37:0:
   drivers/scsi/sd_zbc.c:734:7: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned finish zone request, start %zu/%zu"
          ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
   drivers/scsi/sd_zbc.c:734:7: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned finish zone request, start %zu/%zu"
          ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
   drivers/scsi/sd_zbc.c:734:7: warning: format '%zu' expects argument of type 'size_t', but argument 7 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned finish zone request, start %zu/%zu"
          ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
   drivers/scsi/sd_zbc.c:734:7: warning: format '%zu' expects argument of type 'size_t', but argument 8 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned finish zone request, start %zu/%zu"
          ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
   In file included from include/scsi/scsi_cmnd.h:10:0,
                    from drivers/scsi/sd_zbc.c:30:
   drivers/scsi/sd_zbc.c:734:7: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned finish zone request, start %zu/%zu"
          ^
   include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
     sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
                                       ^~~
   drivers/scsi/sd_zbc.c:733:4: note: in expansion of macro 'sd_printk'
       sd_printk(KERN_ERR, sdkp,
       ^~~~~~~~~
   drivers/scsi/sd_zbc.c:734:7: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned finish zone request, start %zu/%zu"
          ^
   include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
     sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
                                       ^~~
   drivers/scsi/sd_zbc.c:733:4: note: in expansion of macro 'sd_printk'
       sd_printk(KERN_ERR, sdkp,
       ^~~~~~~~~
   drivers/scsi/sd_zbc.c:734:7: warning: format '%zu' expects argument of type 'size_t', but argument 7 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned finish zone request, start %zu/%zu"
          ^
   include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
     sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
                                       ^~~
   drivers/scsi/sd_zbc.c:733:4: note: in expansion of macro 'sd_printk'
       sd_printk(KERN_ERR, sdkp,
       ^~~~~~~~~
   drivers/scsi/sd_zbc.c:734:7: warning: format '%zu' expects argument of type 'size_t', but argument 8 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned finish zone request, start %zu/%zu"
          ^
   include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
     sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
                                       ^~~
   drivers/scsi/sd_zbc.c:733:4: note: in expansion of macro 'sd_printk'
       sd_printk(KERN_ERR, sdkp,
       ^~~~~~~~~
   In file included from include/linux/kernel.h:13:0,
                    from include/linux/sched.h:17,
                    from include/linux/blkdev.h:4,
                    from drivers/scsi/sd_zbc.c:24:
   drivers/scsi/sd_zbc.c: In function 'sd_zbc_setup_read_write':
>> drivers/scsi/sd_zbc.c:61:11: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
     pr_debug("%s %s [%s]: " fmt,    \
              ^
   include/linux/printk.h:260:21: note: in definition of macro 'pr_fmt'
    #define pr_fmt(fmt) fmt
                        ^~~
   include/linux/printk.h:308:2: note: in expansion of macro 'dynamic_pr_debug'
     dynamic_pr_debug(fmt, ##__VA_ARGS__)
     ^~~~~~~~~~~~~~~~
>> drivers/scsi/sd_zbc.c:61:2: note: in expansion of macro 'pr_debug'
     pr_debug("%s %s [%s]: " fmt,    \
     ^~~~~~~~
   drivers/scsi/sd_zbc.c:783:3: note: in expansion of macro 'sd_zbc_debug'
      sd_zbc_debug(sdkp,
      ^~~~~~~~~~~~
   In file included from drivers/scsi/sd_zbc.c:37:0:
   drivers/scsi/sd_zbc.c: In function 'sd_zbc_read_zones':
   drivers/scsi/sd_zbc.c:985:8: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
           "Changing capacity from %zu "
           ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
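
The warnings above all come from printing a sector_t with '%zu', which
expects a size_t. A common way to silence them is to print with '%llu'
and cast the argument to unsigned long long, which stays correct
whether sector_t is 32 or 64 bits wide. The user-space sketch below
only illustrates that pattern: demo_sector_t and format_report_zones()
are hypothetical stand-ins, not names from sd_zbc.c, and the format
string is borrowed from the REPORT ZONES debug message quoted in the
first warning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for the kernel's sector_t on a 64-bit-sector configuration. */
typedef unsigned long long demo_sector_t;
static_assert(sizeof(demo_sector_t) >= 8, "need a 64-bit sector type");

/*
 * Format a sector value portably: '%llu' plus an explicit cast matches
 * on every architecture, unlike '%zu', which is tied to size_t.
 * Returns the number of characters snprintf() would have written.
 */
int format_report_zones(char *buf, size_t size, demo_sector_t lba, int len)
{
	return snprintf(buf, size, "REPORT ZONES lba %llu len %d\n",
			(unsigned long long)lba, len);
}
```

The same cast would apply to each sector_t argument flagged in
sd_zbc_debug(), sd_zbc_err() and sd_printk() calls.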

vim +61 drivers/scsi/sd_zbc.c

    18	 * along with this program; see the file COPYING.  If not, write to
    19	 * the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139,
    20	 * USA.
    21	 *
    22	 */
    23	
  > 24	#include <linux/blkdev.h>
    25	#include <linux/rbtree.h>
    26	
    27	#include <asm/unaligned.h>
    28	
    29	#include <scsi/scsi.h>
  > 30	#include <scsi/scsi_cmnd.h>
    31	#include <scsi/scsi_dbg.h>
    32	#include <scsi/scsi_device.h>
    33	#include <scsi/scsi_driver.h>
    34	#include <scsi/scsi_host.h>
    35	#include <scsi/scsi_eh.h>
    36	
  > 37	#include "sd.h"
    38	#include "scsi_priv.h"
    39	
    40	enum zbc_zone_type {
    41		ZBC_ZONE_TYPE_CONV = 0x1,
    42		ZBC_ZONE_TYPE_SEQWRITE_REQ,
    43		ZBC_ZONE_TYPE_SEQWRITE_PREF,
    44		ZBC_ZONE_TYPE_RESERVED,
    45	};
    46	
    47	enum zbc_zone_cond {
    48		ZBC_ZONE_COND_NO_WP,
    49		ZBC_ZONE_COND_EMPTY,
    50		ZBC_ZONE_COND_IMP_OPEN,
    51		ZBC_ZONE_COND_EXP_OPEN,
    52		ZBC_ZONE_COND_CLOSED,
    53		ZBC_ZONE_COND_READONLY = 0xd,
    54		ZBC_ZONE_COND_FULL,
    55		ZBC_ZONE_COND_OFFLINE,
    56	};
    57	
    58	#define SD_ZBC_BUF_SIZE 131072
    59	
    60	#define sd_zbc_debug(sdkp, fmt, args...)			\
  > 61		pr_debug("%s %s [%s]: " fmt,				\
    62			 dev_driver_string(&(sdkp)->device->sdev_gendev), \
    63			 dev_name(&(sdkp)->device->sdev_gendev),	 \
    64			 (sdkp)->disk->disk_name, ## args)
    65	
    66	#define sd_zbc_debug_ratelimit(sdkp, fmt, args...)		\
    67		do {							\
    68			if (printk_ratelimit())				\
    69				sd_zbc_debug(sdkp, fmt, ## args);	\
    70		} while( 0 )
    71	
    72	#define sd_zbc_err(sdkp, fmt, args...)				\
  > 73		pr_err("%s %s [%s]: " fmt,				\
    74		       dev_driver_string(&(sdkp)->device->sdev_gendev),	\
    75		       dev_name(&(sdkp)->device->sdev_gendev),		\
    76		       (sdkp)->disk->disk_name, ## args)
    77	
    78	struct zbc_zone_work {
    79		struct work_struct 	zone_work;
    80		struct scsi_disk 	*sdkp;
    81		sector_t		sector;
    82		sector_t		nr_sects;
    83		bool 			init;
    84		unsigned int		nr_zones;
    85	};
    86	
    87	struct blk_zone *zbc_desc_to_zone(struct scsi_disk *sdkp, unsigned char *rec)
    88	{
    89		struct blk_zone *zone;
    90	
    91		zone = kzalloc(sizeof(struct blk_zone), GFP_KERNEL);
    92		if (!zone)
    93			return NULL;
    94	
    95		/* Zone type */
    96		switch(rec[0] & 0x0f) {
    97		case ZBC_ZONE_TYPE_CONV:
    98		case ZBC_ZONE_TYPE_SEQWRITE_REQ:
    99		case ZBC_ZONE_TYPE_SEQWRITE_PREF:
   100			zone->type = rec[0] & 0x0f;
   101			break;
   102		default:
   103			zone->type = BLK_ZONE_TYPE_UNKNOWN;
   104			break;
   105		}
   106	
   107		/* Zone condition */
   108		zone->cond = (rec[1] >> 4) & 0xf;
   109		if (rec[1] & 0x01)
   110			zone->reset = 1;
   111		if (rec[1] & 0x02)
   112			zone->non_seq = 1;
   113	
   114		/* Zone start sector and length */
   115		zone->len = logical_to_sectors(sdkp->device,
   116					       get_unaligned_be64(&rec[8]));
   117		zone->start = logical_to_sectors(sdkp->device,
   118						 get_unaligned_be64(&rec[16]));
   119	
   120		/* Zone write pointer */
   121		if (blk_zone_is_empty(zone) &&
   122		    zone->wp != zone->start)
   123			zone->wp = zone->start;
   124		else if (blk_zone_is_full(zone))
   125			zone->wp = zone->start + zone->len;
   126		else if (blk_zone_is_seq(zone))
   127			zone->wp = logical_to_sectors(sdkp->device,
   128						      get_unaligned_be64(&rec[24]));
   129		else
   130			zone->wp = (sector_t)-1;
   131	
   132		return zone;
   133	}
   134	
   135	static int zbc_parse_zones(struct scsi_disk *sdkp, unsigned char *buf,
   136				   unsigned int buf_len, sector_t *next_sector)
   137	{
   138		struct request_queue *q = sdkp->disk->queue;
   139		sector_t capacity = logical_to_sectors(sdkp->device, sdkp->capacity);
   140		unsigned char *rec = buf;
   141		unsigned int zone_len, list_length;
   142	
   143		/* Parse REPORT ZONES header */
   144		list_length = get_unaligned_be32(&buf[0]);
   145		rec = buf + 64;
   146		list_length += 64;
   147	
   148		if (list_length < buf_len)
   149			buf_len = list_length;
   150	
   151		/* Parse REPORT ZONES zone descriptors */
   152		*next_sector = capacity;
   153		while (rec < buf + buf_len) {
   154	
   155			struct blk_zone *new, *old;
   156	
   157			new = zbc_desc_to_zone(sdkp, rec);
   158			if (!new)
   159				return -ENOMEM;
   160	
   161			zone_len = new->len;
   162			*next_sector = new->start + zone_len;
   163	
   164			old = blk_insert_zone(q, new);
   165			if (old) {
   166				blk_lock_zone(old);
   167	
   168				/*
   169				 * Always update the zone state flags and the zone
   170				 * offline and read-only condition as the drive may
   171				 * change those independently of the commands being
   172				 * executed
   173				 */
   174				old->reset = new->reset;
   175				old->non_seq = new->non_seq;
   176				if (blk_zone_is_offline(new) ||
   177				    blk_zone_is_readonly(new))
   178					old->cond = new->cond;
   179	
   180				if (blk_zone_in_update(old)) {
   181					old->cond = new->cond;
   182					old->wp = new->wp;
   183					blk_clear_zone_update(old);
   184				}
   185	
   186				blk_unlock_zone(old);
   187	
   188				kfree(new);
   189			}
   190	
   191			rec += 64;
   192	
   193		}
   194	
   195		return 0;
   196	}
   197	
   198	/**
   199	 * sd_zbc_report_zones - Issue a REPORT ZONES scsi command
   200	 * @sdkp: SCSI disk to which the command should be sent
   201	 * @buffer: response buffer
   202	 * @bufflen: length of @buffer
   203	 * @start_sector: logical sector for which the zone information should be reported
   204	 * @option: reporting option to be used
   205	 * @partial: flag to set the 'partial' bit for report zones command
   206	 */
   207	int sd_zbc_report_zones(struct scsi_disk *sdkp, unsigned char *buffer,
   208				int bufflen, sector_t start_sector,
   209				enum zbc_zone_reporting_options option, bool partial)
   210	{
   211		struct scsi_device *sdp = sdkp->device;
   212		const int timeout = sdp->request_queue->rq_timeout;
   213		struct scsi_sense_hdr sshdr;
   214		sector_t start_lba = sectors_to_logical(sdkp->device, start_sector);
   215		unsigned char cmd[16];
   216		int result;
   217	
   218		if (!scsi_device_online(sdp))
   219			return -ENODEV;
   220	
 > 221		sd_zbc_debug(sdkp, "REPORT ZONES lba %zu len %d\n",
   222			     start_lba, bufflen);
   223	
   224		memset(cmd, 0, 16);
   225		cmd[0] = ZBC_IN;
   226		cmd[1] = ZI_REPORT_ZONES;
   227		put_unaligned_be64(start_lba, &cmd[2]);
   228		put_unaligned_be32(bufflen, &cmd[10]);
   229		cmd[14] = (partial ? ZBC_REPORT_ZONE_PARTIAL : 0) | option;
   230		memset(buffer, 0, bufflen);
   231	
   232		result = scsi_execute_req(sdp, cmd, DMA_FROM_DEVICE,
   233					buffer, bufflen, &sshdr,
   234					timeout, SD_MAX_RETRIES, NULL);
   235	
   236		if (result) {
 > 237			sd_zbc_err(sdkp,
   238				   "REPORT ZONES lba %zu failed with %d/%d\n",
   239				   start_lba, host_byte(result), driver_byte(result));
   240			return -EIO;
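
The quoted zbc_desc_to_zone() reads each 64-byte REPORT ZONES descriptor at fixed byte offsets. The following is only a userspace sketch of that layout as the excerpt uses it. struct zone_desc, be64_at() and parse_zone_desc() are hypothetical names, not part of the patch, and the sketch leaves values in logical blocks instead of converting them with logical_to_sectors().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ZONE_DESC_SIZE 64	/* bytes per zone descriptor, as in the excerpt */

/* Hypothetical mirror of the fields the quoted code extracts. */
struct zone_desc {
	uint8_t  type;		/* rec[0], bits 3:0 */
	uint8_t  cond;		/* rec[1], bits 7:4 */
	uint8_t  reset;		/* rec[1], bit 0 */
	uint8_t  non_seq;	/* rec[1], bit 1 */
	uint64_t len;		/* rec[8..15], big-endian, logical blocks */
	uint64_t start;		/* rec[16..23], big-endian, logical blocks */
	uint64_t wp;		/* rec[24..31], big-endian, logical blocks */
};

/* Read a big-endian 64-bit value from a possibly unaligned buffer. */
static uint64_t be64_at(const uint8_t *p)
{
	uint64_t v = 0;
	int i;

	for (i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Decode one descriptor using the same offsets as the quoted code. */
static void parse_zone_desc(const uint8_t *rec, struct zone_desc *z)
{
	assert(rec != NULL && z != NULL);
	memset(z, 0, sizeof(*z));

	z->type    = rec[0] & 0x0f;
	z->cond    = (rec[1] >> 4) & 0x0f;
	z->reset   = (rec[1] & 0x01) ? 1 : 0;
	z->non_seq = (rec[1] & 0x02) ? 1 : 0;
	z->len     = be64_at(&rec[8]);
	z->start   = be64_at(&rec[16]);
	z->wp      = be64_at(&rec[24]);
}
```

A caller would step through the response buffer in ZONE_DESC_SIZE increments after the 64-byte header, which is what the loop in zbc_parse_zones() does with rec += 64.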

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 55864 bytes --]


* Re: [PATCH 8/9] sd: Implement support for ZBC devices
@ 2016-09-20  0:08     ` kbuild test robot
From: kbuild test robot @ 2016-09-20  0:08 UTC (permalink / raw)
  Cc: kbuild-all, linux-scsi, linux-block, martin.petersen, axboe,
	hare, shaun.tancheff, Hannes Reinecke, Damien Le Moal

[-- Attachment #1: Type: text/plain, Size: 30989 bytes --]

Hi Hannes,

[auto build test WARNING on linus/master]
[also build test WARNING on v4.8-rc7]
[cannot apply to block/for-next next-20160919]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
[We suggest using git (>= 2.9.0) format-patch --base=<commit> (or --base=auto for convenience) to record which (public, well-known) commit your patch series was built on]
[Check https://git-scm.com/docs/git-format-patch for more information]

url:    https://github.com/0day-ci/linux/commits/Damien-Le-Moal/ZBC-Zoned-block-device-support/20160920-062608
config: i386-allmodconfig (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All warnings (new ones prefixed by >>):

   In file included from include/linux/kernel.h:13:0,
                    from include/linux/sched.h:17,
                    from include/linux/blkdev.h:4,
                    from drivers/scsi/sd_zbc.c:24:
   drivers/scsi/sd_zbc.c: In function 'sd_zbc_report_zones':
>> drivers/scsi/sd_zbc.c:61:11: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
     pr_debug("%s %s [%s]: " fmt,    \
              ^
   include/linux/printk.h:260:21: note: in definition of macro 'pr_fmt'
    #define pr_fmt(fmt) fmt
                        ^~~
   include/linux/printk.h:308:2: note: in expansion of macro 'dynamic_pr_debug'
     dynamic_pr_debug(fmt, ##__VA_ARGS__)
     ^~~~~~~~~~~~~~~~
>> drivers/scsi/sd_zbc.c:61:2: note: in expansion of macro 'pr_debug'
     pr_debug("%s %s [%s]: " fmt,    \
     ^~~~~~~~
>> drivers/scsi/sd_zbc.c:221:2: note: in expansion of macro 'sd_zbc_debug'
     sd_zbc_debug(sdkp, "REPORT ZONES lba %zu len %d\n",
     ^~~~~~~~~~~~
   In file included from include/linux/printk.h:6:0,
                    from include/linux/kernel.h:13,
                    from include/linux/sched.h:17,
                    from include/linux/blkdev.h:4,
                    from drivers/scsi/sd_zbc.c:24:
>> include/linux/kern_levels.h:4:18: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
    #define KERN_SOH "\001"  /* ASCII Start Of Header */
                     ^
   include/linux/kern_levels.h:10:18: note: in expansion of macro 'KERN_SOH'
    #define KERN_ERR KERN_SOH "3" /* error conditions */
                     ^~~~~~~~
   include/linux/printk.h:276:9: note: in expansion of macro 'KERN_ERR'
     printk(KERN_ERR pr_fmt(fmt), ##__VA_ARGS__)
            ^~~~~~~~
>> drivers/scsi/sd_zbc.c:73:2: note: in expansion of macro 'pr_err'
     pr_err("%s %s [%s]: " fmt,    \
     ^~~~~~
>> drivers/scsi/sd_zbc.c:237:3: note: in expansion of macro 'sd_zbc_err'
      sd_zbc_err(sdkp,
      ^~~~~~~~~~
   In file included from include/linux/kernel.h:13:0,
                    from include/linux/sched.h:17,
                    from include/linux/blkdev.h:4,
                    from drivers/scsi/sd_zbc.c:24:
   drivers/scsi/sd_zbc.c: In function 'sd_zbc_setup_reset_cmnd':
>> drivers/scsi/sd_zbc.c:61:11: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
     pr_debug("%s %s [%s]: " fmt,    \
              ^
   include/linux/printk.h:260:21: note: in definition of macro 'pr_fmt'
    #define pr_fmt(fmt) fmt
                        ^~~
   include/linux/printk.h:308:2: note: in expansion of macro 'dynamic_pr_debug'
     dynamic_pr_debug(fmt, ##__VA_ARGS__)
     ^~~~~~~~~~~~~~~~
>> drivers/scsi/sd_zbc.c:61:2: note: in expansion of macro 'pr_debug'
     pr_debug("%s %s [%s]: " fmt,    \
     ^~~~~~~~
   drivers/scsi/sd_zbc.c:489:4: note: in expansion of macro 'sd_zbc_debug'
       sd_zbc_debug(sdkp,
       ^~~~~~~~~~~~
   In file included from drivers/scsi/sd_zbc.c:37:0:
   drivers/scsi/sd_zbc.c:517:7: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned reset wp request, start %zu/%zu"
          ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
   drivers/scsi/sd_zbc.c:517:7: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned reset wp request, start %zu/%zu"
          ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
   drivers/scsi/sd_zbc.c:517:7: warning: format '%zu' expects argument of type 'size_t', but argument 7 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned reset wp request, start %zu/%zu"
          ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
   drivers/scsi/sd_zbc.c:517:7: warning: format '%zu' expects argument of type 'size_t', but argument 8 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned reset wp request, start %zu/%zu"
          ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
   In file included from include/scsi/scsi_cmnd.h:10:0,
                    from drivers/scsi/sd_zbc.c:30:
   drivers/scsi/sd_zbc.c:517:7: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned reset wp request, start %zu/%zu"
          ^
   include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
     sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
                                       ^~~
>> drivers/scsi/sd_zbc.c:516:4: note: in expansion of macro 'sd_printk'
       sd_printk(KERN_ERR, sdkp,
       ^~~~~~~~~
   drivers/scsi/sd_zbc.c:517:7: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned reset wp request, start %zu/%zu"
          ^
   include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
     sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
                                       ^~~
>> drivers/scsi/sd_zbc.c:516:4: note: in expansion of macro 'sd_printk'
       sd_printk(KERN_ERR, sdkp,
       ^~~~~~~~~
   drivers/scsi/sd_zbc.c:517:7: warning: format '%zu' expects argument of type 'size_t', but argument 7 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned reset wp request, start %zu/%zu"
          ^
   include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
     sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
                                       ^~~
>> drivers/scsi/sd_zbc.c:516:4: note: in expansion of macro 'sd_printk'
       sd_printk(KERN_ERR, sdkp,
       ^~~~~~~~~
   drivers/scsi/sd_zbc.c:517:7: warning: format '%zu' expects argument of type 'size_t', but argument 8 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned reset wp request, start %zu/%zu"
          ^
   include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
     sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
                                       ^~~
>> drivers/scsi/sd_zbc.c:516:4: note: in expansion of macro 'sd_printk'
       sd_printk(KERN_ERR, sdkp,
       ^~~~~~~~~
   In file included from include/linux/kernel.h:13:0,
                    from include/linux/sched.h:17,
                    from include/linux/blkdev.h:4,
                    from drivers/scsi/sd_zbc.c:24:
   drivers/scsi/sd_zbc.c: In function 'sd_zbc_setup_open_cmnd':
>> drivers/scsi/sd_zbc.c:61:11: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
     pr_debug("%s %s [%s]: " fmt,    \
              ^
   include/linux/printk.h:260:21: note: in definition of macro 'pr_fmt'
    #define pr_fmt(fmt) fmt
                        ^~~
   include/linux/printk.h:308:2: note: in expansion of macro 'dynamic_pr_debug'
     dynamic_pr_debug(fmt, ##__VA_ARGS__)
     ^~~~~~~~~~~~~~~~
>> drivers/scsi/sd_zbc.c:61:2: note: in expansion of macro 'pr_debug'
     pr_debug("%s %s [%s]: " fmt,    \
     ^~~~~~~~
   drivers/scsi/sd_zbc.c:573:4: note: in expansion of macro 'sd_zbc_debug'
       sd_zbc_debug(sdkp,
       ^~~~~~~~~~~~
   In file included from drivers/scsi/sd_zbc.c:37:0:
   drivers/scsi/sd_zbc.c:594:7: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned open zone request, start %zu/%zu"
          ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
   drivers/scsi/sd_zbc.c:594:7: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned open zone request, start %zu/%zu"
          ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
   drivers/scsi/sd_zbc.c:594:7: warning: format '%zu' expects argument of type 'size_t', but argument 7 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned open zone request, start %zu/%zu"
          ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
   drivers/scsi/sd_zbc.c:594:7: warning: format '%zu' expects argument of type 'size_t', but argument 8 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned open zone request, start %zu/%zu"
          ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
   In file included from include/scsi/scsi_cmnd.h:10:0,
                    from drivers/scsi/sd_zbc.c:30:
   drivers/scsi/sd_zbc.c:594:7: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned open zone request, start %zu/%zu"
          ^
   include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
     sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
                                       ^~~
   drivers/scsi/sd_zbc.c:593:4: note: in expansion of macro 'sd_printk'
       sd_printk(KERN_ERR, sdkp,
       ^~~~~~~~~
   drivers/scsi/sd_zbc.c:594:7: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned open zone request, start %zu/%zu"
          ^
   include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
     sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
                                       ^~~
   drivers/scsi/sd_zbc.c:593:4: note: in expansion of macro 'sd_printk'
       sd_printk(KERN_ERR, sdkp,
       ^~~~~~~~~
   drivers/scsi/sd_zbc.c:594:7: warning: format '%zu' expects argument of type 'size_t', but argument 7 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned open zone request, start %zu/%zu"
          ^
   include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
     sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
                                       ^~~
   drivers/scsi/sd_zbc.c:593:4: note: in expansion of macro 'sd_printk'
       sd_printk(KERN_ERR, sdkp,
       ^~~~~~~~~
   drivers/scsi/sd_zbc.c:594:7: warning: format '%zu' expects argument of type 'size_t', but argument 8 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned open zone request, start %zu/%zu"
          ^
   include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
     sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
                                       ^~~
   drivers/scsi/sd_zbc.c:593:4: note: in expansion of macro 'sd_printk'
       sd_printk(KERN_ERR, sdkp,
       ^~~~~~~~~
   In file included from include/linux/kernel.h:13:0,
                    from include/linux/sched.h:17,
                    from include/linux/blkdev.h:4,
                    from drivers/scsi/sd_zbc.c:24:
   drivers/scsi/sd_zbc.c: In function 'sd_zbc_setup_close_cmnd':
>> drivers/scsi/sd_zbc.c:61:11: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
     pr_debug("%s %s [%s]: " fmt,    \
              ^
   include/linux/printk.h:260:21: note: in definition of macro 'pr_fmt'
    #define pr_fmt(fmt) fmt
                        ^~~
   include/linux/printk.h:308:2: note: in expansion of macro 'dynamic_pr_debug'
     dynamic_pr_debug(fmt, ##__VA_ARGS__)
     ^~~~~~~~~~~~~~~~
>> drivers/scsi/sd_zbc.c:61:2: note: in expansion of macro 'pr_debug'
     pr_debug("%s %s [%s]: " fmt,    \
     ^~~~~~~~
   drivers/scsi/sd_zbc.c:645:4: note: in expansion of macro 'sd_zbc_debug'
       sd_zbc_debug(sdkp,
       ^~~~~~~~~~~~
   In file included from drivers/scsi/sd_zbc.c:37:0:
   drivers/scsi/sd_zbc.c:666:7: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned close zone request, start %zu/%zu"
          ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
   drivers/scsi/sd_zbc.c:666:7: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned close zone request, start %zu/%zu"
          ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
   drivers/scsi/sd_zbc.c:666:7: warning: format '%zu' expects argument of type 'size_t', but argument 7 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned close zone request, start %zu/%zu"
          ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
   drivers/scsi/sd_zbc.c:666:7: warning: format '%zu' expects argument of type 'size_t', but argument 8 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned close zone request, start %zu/%zu"
          ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
   In file included from include/scsi/scsi_cmnd.h:10:0,
                    from drivers/scsi/sd_zbc.c:30:
   drivers/scsi/sd_zbc.c:666:7: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned close zone request, start %zu/%zu"
          ^
   include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
     sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
                                       ^~~
   drivers/scsi/sd_zbc.c:665:4: note: in expansion of macro 'sd_printk'
       sd_printk(KERN_ERR, sdkp,
       ^~~~~~~~~
   drivers/scsi/sd_zbc.c:666:7: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned close zone request, start %zu/%zu"
          ^
   include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
     sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
                                       ^~~
   drivers/scsi/sd_zbc.c:665:4: note: in expansion of macro 'sd_printk'
       sd_printk(KERN_ERR, sdkp,
       ^~~~~~~~~
   drivers/scsi/sd_zbc.c:666:7: warning: format '%zu' expects argument of type 'size_t', but argument 7 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned close zone request, start %zu/%zu"
          ^
   include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
     sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
                                       ^~~
   drivers/scsi/sd_zbc.c:665:4: note: in expansion of macro 'sd_printk'
       sd_printk(KERN_ERR, sdkp,
       ^~~~~~~~~
   drivers/scsi/sd_zbc.c:666:7: warning: format '%zu' expects argument of type 'size_t', but argument 8 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned close zone request, start %zu/%zu"
          ^
   include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
     sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
                                       ^~~
   drivers/scsi/sd_zbc.c:665:4: note: in expansion of macro 'sd_printk'
       sd_printk(KERN_ERR, sdkp,
       ^~~~~~~~~
   In file included from include/linux/kernel.h:13:0,
                    from include/linux/sched.h:17,
                    from include/linux/blkdev.h:4,
                    from drivers/scsi/sd_zbc.c:24:
   drivers/scsi/sd_zbc.c: In function 'sd_zbc_setup_finish_cmnd':
>> drivers/scsi/sd_zbc.c:61:11: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
     pr_debug("%s %s [%s]: " fmt,    \
              ^
   include/linux/printk.h:260:21: note: in definition of macro 'pr_fmt'
    #define pr_fmt(fmt) fmt
                        ^~~
   include/linux/printk.h:308:2: note: in expansion of macro 'dynamic_pr_debug'
     dynamic_pr_debug(fmt, ##__VA_ARGS__)
     ^~~~~~~~~~~~~~~~
>> drivers/scsi/sd_zbc.c:61:2: note: in expansion of macro 'pr_debug'
     pr_debug("%s %s [%s]: " fmt,    \
     ^~~~~~~~
   drivers/scsi/sd_zbc.c:717:4: note: in expansion of macro 'sd_zbc_debug'
       sd_zbc_debug(sdkp,
       ^~~~~~~~~~~~
   In file included from drivers/scsi/sd_zbc.c:37:0:
   drivers/scsi/sd_zbc.c:734:7: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned finish zone request, start %zu/%zu"
          ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
   drivers/scsi/sd_zbc.c:734:7: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned finish zone request, start %zu/%zu"
          ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
   drivers/scsi/sd_zbc.c:734:7: warning: format '%zu' expects argument of type 'size_t', but argument 7 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned finish zone request, start %zu/%zu"
          ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
   drivers/scsi/sd_zbc.c:734:7: warning: format '%zu' expects argument of type 'size_t', but argument 8 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned finish zone request, start %zu/%zu"
          ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
   In file included from include/scsi/scsi_cmnd.h:10:0,
                    from drivers/scsi/sd_zbc.c:30:
   drivers/scsi/sd_zbc.c:734:7: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned finish zone request, start %zu/%zu"
          ^
   include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
     sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
                                       ^~~
   drivers/scsi/sd_zbc.c:733:4: note: in expansion of macro 'sd_printk'
       sd_printk(KERN_ERR, sdkp,
       ^~~~~~~~~
   drivers/scsi/sd_zbc.c:734:7: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned finish zone request, start %zu/%zu"
          ^
   include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
     sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
                                       ^~~
   drivers/scsi/sd_zbc.c:733:4: note: in expansion of macro 'sd_printk'
       sd_printk(KERN_ERR, sdkp,
       ^~~~~~~~~
   drivers/scsi/sd_zbc.c:734:7: warning: format '%zu' expects argument of type 'size_t', but argument 7 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned finish zone request, start %zu/%zu"
          ^
   include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
     sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
                                       ^~~
   drivers/scsi/sd_zbc.c:733:4: note: in expansion of macro 'sd_printk'
       sd_printk(KERN_ERR, sdkp,
       ^~~~~~~~~
   drivers/scsi/sd_zbc.c:734:7: warning: format '%zu' expects argument of type 'size_t', but argument 8 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
          "Unaligned finish zone request, start %zu/%zu"
          ^
   include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
     sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
                                       ^~~
   drivers/scsi/sd_zbc.c:733:4: note: in expansion of macro 'sd_printk'
       sd_printk(KERN_ERR, sdkp,
       ^~~~~~~~~
   In file included from include/linux/kernel.h:13:0,
                    from include/linux/sched.h:17,
                    from include/linux/blkdev.h:4,
                    from drivers/scsi/sd_zbc.c:24:
   drivers/scsi/sd_zbc.c: In function 'sd_zbc_setup_read_write':
>> drivers/scsi/sd_zbc.c:61:11: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
     pr_debug("%s %s [%s]: " fmt,    \
              ^
   include/linux/printk.h:260:21: note: in definition of macro 'pr_fmt'
    #define pr_fmt(fmt) fmt
                        ^~~
   include/linux/printk.h:308:2: note: in expansion of macro 'dynamic_pr_debug'
     dynamic_pr_debug(fmt, ##__VA_ARGS__)
     ^~~~~~~~~~~~~~~~
>> drivers/scsi/sd_zbc.c:61:2: note: in expansion of macro 'pr_debug'
     pr_debug("%s %s [%s]: " fmt,    \
     ^~~~~~~~
   drivers/scsi/sd_zbc.c:783:3: note: in expansion of macro 'sd_zbc_debug'
      sd_zbc_debug(sdkp,
      ^~~~~~~~~~~~
   In file included from drivers/scsi/sd_zbc.c:37:0:
   drivers/scsi/sd_zbc.c: In function 'sd_zbc_read_zones':
   drivers/scsi/sd_zbc.c:985:8: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
           "Changing capacity from %zu "
           ^
   drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
         (sdsk)->disk->disk_name, fmt, ##a) : \
                                  ^~~
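
All of the '%zu' warnings above have the same cause: a sector_t argument is passed where the format string promises a size_t. A minimal sketch of one common remedy follows, assuming a cast to unsigned long long paired with %llu is acceptable for these messages. The sector_t typedef and format_lba() are hypothetical userspace stand-ins, not code from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's sector_t; 64-bit in this sketch. */
typedef uint64_t sector_t;

/*
 * Build the REPORT ZONES debug line. The sector value is cast to
 * unsigned long long and printed with %llu, so the conversion matches
 * the argument whatever integer type sector_t happens to be.
 * Returns the snprintf() result, i.e. the length the full message needs.
 */
static int format_lba(char *out, size_t outlen, sector_t lba, int len)
{
	assert(out != NULL && outlen > 0);
	return snprintf(out, outlen, "REPORT ZONES lba %llu len %d\n",
			(unsigned long long)lba, len);
}
```

The same cast-and-%llu pattern would apply to each flagged call site (sd_zbc_debug, sd_zbc_err and sd_printk), since they all forward their arguments to a printf-style function.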

vim +61 drivers/scsi/sd_zbc.c

    18	 * along with this program; see the file COPYING.  If not, write to
    19	 * the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139,
    20	 * USA.
    21	 *
    22	 */
    23	
  > 24	#include <linux/blkdev.h>
    25	#include <linux/rbtree.h>
    26	
    27	#include <asm/unaligned.h>
    28	
    29	#include <scsi/scsi.h>
  > 30	#include <scsi/scsi_cmnd.h>
    31	#include <scsi/scsi_dbg.h>
    32	#include <scsi/scsi_device.h>
    33	#include <scsi/scsi_driver.h>
    34	#include <scsi/scsi_host.h>
    35	#include <scsi/scsi_eh.h>
    36	
  > 37	#include "sd.h"
    38	#include "scsi_priv.h"
    39	
    40	enum zbc_zone_type {
    41		ZBC_ZONE_TYPE_CONV = 0x1,
    42		ZBC_ZONE_TYPE_SEQWRITE_REQ,
    43		ZBC_ZONE_TYPE_SEQWRITE_PREF,
    44		ZBC_ZONE_TYPE_RESERVED,
    45	};
    46	
    47	enum zbc_zone_cond {
    48		ZBC_ZONE_COND_NO_WP,
    49		ZBC_ZONE_COND_EMPTY,
    50		ZBC_ZONE_COND_IMP_OPEN,
    51		ZBC_ZONE_COND_EXP_OPEN,
    52		ZBC_ZONE_COND_CLOSED,
    53		ZBC_ZONE_COND_READONLY = 0xd,
    54		ZBC_ZONE_COND_FULL,
    55		ZBC_ZONE_COND_OFFLINE,
    56	};
    57	
    58	#define SD_ZBC_BUF_SIZE 131072
    59	
    60	#define sd_zbc_debug(sdkp, fmt, args...)			\
  > 61		pr_debug("%s %s [%s]: " fmt,				\
    62			 dev_driver_string(&(sdkp)->device->sdev_gendev), \
    63			 dev_name(&(sdkp)->device->sdev_gendev),	 \
    64			 (sdkp)->disk->disk_name, ## args)
    65	
    66	#define sd_zbc_debug_ratelimit(sdkp, fmt, args...)		\
    67		do {							\
    68			if (printk_ratelimit())				\
    69				sd_zbc_debug(sdkp, fmt, ## args);	\
    70		} while( 0 )
    71	
    72	#define sd_zbc_err(sdkp, fmt, args...)				\
  > 73		pr_err("%s %s [%s]: " fmt,				\
    74		       dev_driver_string(&(sdkp)->device->sdev_gendev),	\
    75		       dev_name(&(sdkp)->device->sdev_gendev),		\
    76		       (sdkp)->disk->disk_name, ## args)
    77	
    78	struct zbc_zone_work {
    79		struct work_struct 	zone_work;
    80		struct scsi_disk 	*sdkp;
    81		sector_t		sector;
    82		sector_t		nr_sects;
    83		bool 			init;
    84		unsigned int		nr_zones;
    85	};
    86	
    87	struct blk_zone *zbc_desc_to_zone(struct scsi_disk *sdkp, unsigned char *rec)
    88	{
    89		struct blk_zone *zone;
    90	
    91		zone = kzalloc(sizeof(struct blk_zone), GFP_KERNEL);
    92		if (!zone)
    93			return NULL;
    94	
    95		/* Zone type */
    96		switch(rec[0] & 0x0f) {
    97		case ZBC_ZONE_TYPE_CONV:
    98		case ZBC_ZONE_TYPE_SEQWRITE_REQ:
    99		case ZBC_ZONE_TYPE_SEQWRITE_PREF:
   100			zone->type = rec[0] & 0x0f;
   101			break;
   102		default:
   103			zone->type = BLK_ZONE_TYPE_UNKNOWN;
   104			break;
   105		}
   106	
   107		/* Zone condition */
   108		zone->cond = (rec[1] >> 4) & 0xf;
   109		if (rec[1] & 0x01)
   110			zone->reset = 1;
   111		if (rec[1] & 0x02)
   112			zone->non_seq = 1;
   113	
   114		/* Zone start sector and length */
   115		zone->len = logical_to_sectors(sdkp->device,
   116					       get_unaligned_be64(&rec[8]));
   117		zone->start = logical_to_sectors(sdkp->device,
   118						 get_unaligned_be64(&rec[16]));
   119	
   120		/* Zone write pointer */
   121		if (blk_zone_is_empty(zone) &&
   122		    zone->wp != zone->start)
   123			zone->wp = zone->start;
   124		else if (blk_zone_is_full(zone))
   125			zone->wp = zone->start + zone->len;
   126		else if (blk_zone_is_seq(zone))
   127			zone->wp = logical_to_sectors(sdkp->device,
   128						      get_unaligned_be64(&rec[24]));
   129		else
   130			zone->wp = (sector_t)-1;
   131	
   132		return zone;
   133	}
   134	
   135	static int zbc_parse_zones(struct scsi_disk *sdkp, unsigned char *buf,
   136				   unsigned int buf_len, sector_t *next_sector)
   137	{
   138		struct request_queue *q = sdkp->disk->queue;
   139		sector_t capacity = logical_to_sectors(sdkp->device, sdkp->capacity);
   140		unsigned char *rec = buf;
   141		unsigned int zone_len, list_length;
   142	
   143		/* Parse REPORT ZONES header */
   144		list_length = get_unaligned_be32(&buf[0]);
   145		rec = buf + 64;
   146		list_length += 64;
   147	
   148		if (list_length < buf_len)
   149			buf_len = list_length;
   150	
   151		/* Parse REPORT ZONES zone descriptors */
   152		*next_sector = capacity;
   153		while (rec < buf + buf_len) {
   154	
   155			struct blk_zone *new, *old;
   156	
   157			new = zbc_desc_to_zone(sdkp, rec);
   158			if (!new)
   159				return -ENOMEM;
   160	
   161			zone_len = new->len;
   162			*next_sector = new->start + zone_len;
   163	
   164			old = blk_insert_zone(q, new);
   165			if (old) {
   166				blk_lock_zone(old);
   167	
   168				/*
   169				 * Always update the zone state flags and the zone
   170				 * offline and read-only condition as the drive may
   171				 * change those independently of the commands being
   172				 * executed
   173				 */
   174				old->reset = new->reset;
   175				old->non_seq = new->non_seq;
   176				if (blk_zone_is_offline(new) ||
   177				    blk_zone_is_readonly(new))
   178					old->cond = new->cond;
   179	
   180				if (blk_zone_in_update(old)) {
   181					old->cond = new->cond;
   182					old->wp = new->wp;
   183					blk_clear_zone_update(old);
   184				}
   185	
   186				blk_unlock_zone(old);
   187	
   188				kfree(new);
   189			}
   190	
   191			rec += 64;
   192	
   193		}
   194	
   195		return 0;
   196	}
   197	
   198	/**
   199	 * sd_zbc_report_zones - Issue a REPORT ZONES scsi command
   200	 * @sdkp: SCSI disk to which the command should be send
   201	 * @buffer: response buffer
   202	 * @bufflen: length of @buffer
   203	 * @start_sector: logical sector for the zone information should be reported
   204	 * @option: reporting option to be used
   205	 * @partial: flag to set the 'partial' bit for report zones command
   206	 */
   207	int sd_zbc_report_zones(struct scsi_disk *sdkp, unsigned char *buffer,
   208				int bufflen, sector_t start_sector,
   209				enum zbc_zone_reporting_options option, bool partial)
   210	{
   211		struct scsi_device *sdp = sdkp->device;
   212		const int timeout = sdp->request_queue->rq_timeout;
   213		struct scsi_sense_hdr sshdr;
   214		sector_t start_lba = sectors_to_logical(sdkp->device, start_sector);
   215		unsigned char cmd[16];
   216		int result;
   217	
   218		if (!scsi_device_online(sdp))
   219			return -ENODEV;
   220	
 > 221		sd_zbc_debug(sdkp, "REPORT ZONES lba %zu len %d\n",
   222			     start_lba, bufflen);
   223	
   224		memset(cmd, 0, 16);
   225		cmd[0] = ZBC_IN;
   226		cmd[1] = ZI_REPORT_ZONES;
   227		put_unaligned_be64(start_lba, &cmd[2]);
   228		put_unaligned_be32(bufflen, &cmd[10]);
   229		cmd[14] = (partial ? ZBC_REPORT_ZONE_PARTIAL : 0) | option;
   230		memset(buffer, 0, bufflen);
   231	
   232		result = scsi_execute_req(sdp, cmd, DMA_FROM_DEVICE,
   233					buffer, bufflen, &sshdr,
   234					timeout, SD_MAX_RETRIES, NULL);
   235	
   236		if (result) {
 > 237			sd_zbc_err(sdkp,
   238				   "REPORT ZONES lba %zu failed with %d/%d\n",
   239				   start_lba, host_byte(result), driver_byte(result));
   240			return -EIO;

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 55864 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 9/9] blk-zoned: Add ioctl interface for zone operations
  2016-09-19 21:27   ` Damien Le Moal
@ 2016-09-20  2:39     ` kbuild test robot
  -1 siblings, 0 replies; 36+ messages in thread
From: kbuild test robot @ 2016-09-20  2:39 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: kbuild-all, linux-scsi, linux-block, martin.petersen, axboe,
	hare, shaun.tancheff, Damien Le Moal

[-- Attachment #1: Type: text/plain, Size: 3299 bytes --]

Hi Shaun,

[auto build test ERROR on linus/master]
[also build test ERROR on v4.8-rc7]
[cannot apply to block/for-next next-20160919]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
[Suggest to use git(>=2.9.0) format-patch --base=<commit> (or --base=auto for convenience) to record what (public, well-known) commit your patch series was built on]
[Check https://git-scm.com/docs/git-format-patch for more information]

url:    https://github.com/0day-ci/linux/commits/Damien-Le-Moal/ZBC-Zoned-block-device-support/20160920-062608
config: blackfin-allyesconfig (attached as .config)
compiler: bfin-uclinux-gcc (GCC) 6.2.0
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=blackfin 

All error/warnings (new ones prefixed by >>):

   In file included from include/linux/linkage.h:4:0,
                    from include/linux/kernel.h:6,
                    from block/blk-zoned.c:11:
   In function 'blkdev_zone_action_ioctl',
       inlined from 'blkdev_zone_ioctl' at block/blk-zoned.c:445:7:
>> include/linux/compiler.h:491:38: error: call to '__compiletime_assert_382' declared with attribute error: BUILD_BUG_ON failed: ptr_size >= 8
     _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
                                         ^
   include/linux/compiler.h:474:4: note: in definition of macro '__compiletime_assert'
       prefix ## suffix();    \
       ^~~~~~
   include/linux/compiler.h:491:2: note: in expansion of macro '_compiletime_assert'
     _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
     ^~~~~~~~~~~~~~~~~~~
   include/linux/bug.h:51:37: note: in expansion of macro 'compiletime_assert'
    #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
                                        ^~~~~~~~~~~~~~~~~~
   include/linux/bug.h:75:2: note: in expansion of macro 'BUILD_BUG_ON_MSG'
     BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition)
     ^~~~~~~~~~~~~~~~
>> arch/blackfin/include/asm/uaccess.h:136:3: note: in expansion of macro 'BUILD_BUG_ON'
      BUILD_BUG_ON(ptr_size >= 8);   \
      ^~~~~~~~~~~~
>> block/blk-zoned.c:382:6: note: in expansion of macro 'get_user'
     if (get_user(sector, (u64 __user *)argp))
         ^~~~~~~~

vim +/get_user +382 block/blk-zoned.c

   366		z.reset = zone->reset;
   367	
   368		blk_unlock_zone(zone);
   369	
   370		if (copy_to_user(argp, &z, sizeof(struct blkzone)))
   371			return -EFAULT;
   372	
   373		return 0;
   374	}
   375	
   376	static int blkdev_zone_action_ioctl(struct block_device *bdev,
   377					    unsigned cmd, void __user *argp)
   378	{
   379		unsigned int op;
   380		u64 sector;
   381	
 > 382		if (get_user(sector, (u64 __user *)argp))
   383			return -EFAULT;
   384	
   385		switch (cmd) {
   386		case BLKRESETZONE:
   387			op = REQ_OP_ZONE_RESET;
   388			break;
   389		case BLKOPENZONE:
   390			op = REQ_OP_ZONE_OPEN;
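
For reference, the BUILD_BUG_ON above fires because the blackfin get_user()
rejects 8-byte accesses. One common way around this is to fetch the 64-bit
argument with copy_from_user() instead. The sketch below only illustrates the
idea in userspace and is not the fix chosen for the series:
mock_copy_from_user() is a stand-in for the kernel helper, and
zone_action_get_sector() is a hypothetical name.

```c
#include <stdint.h>
#include <string.h>
#include <errno.h>

/*
 * Userspace stand-in for the kernel's copy_from_user() (assumption, for
 * illustration only). Like the real helper, it returns the number of bytes
 * that could NOT be copied, so 0 means success.
 */
static unsigned long mock_copy_from_user(void *to, const void *from,
					 unsigned long n)
{
	if (!from)
		return n;
	memcpy(to, from, n);
	return 0;
}

/* Hypothetical helper: fetch the 64-bit sector argument of a zone ioctl. */
static int zone_action_get_sector(const void *argp, uint64_t *sector)
{
	if (mock_copy_from_user(sector, argp, sizeof(*sector)))
		return -EFAULT;
	return 0;
}
```

Because the copy is done by size rather than by type, it does not depend on
the architecture providing an 8-byte get_user().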

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 41390 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 1/9] block: Add 'zoned' queue limit
  2016-09-19 21:27   ` Damien Le Moal
@ 2016-09-20  4:05     ` Bart Van Assche
  -1 siblings, 0 replies; 36+ messages in thread
From: Bart Van Assche @ 2016-09-20  4:05 UTC (permalink / raw)
  To: Damien Le Moal, linux-scsi, linux-block
  Cc: martin.petersen, axboe, hare, shaun.tancheff

On 09/19/16 14:27, Damien Le Moal wrote:
> +/*
> + * Zoned block device models (zoned limit).
> + */
> +enum blk_zoned_model {
> +	BLK_ZONED_NONE,	/* Regular block device */
> +	BLK_ZONED_HA, 	/* Host-aware zoned block device */
> +	BLK_ZONED_HM,	/* Host-managed zoned block device */
> +};

[ ... ]

> +static inline unsigned int blk_queue_zoned(struct request_queue *q)
> +{
> +	return q->limits.zoned;
> +}
> +
>  /*
>   * We regard a request as sync, if either a read or a sync write
>   */
> @@ -1354,6 +1369,16 @@ static inline unsigned int bdev_write_same(struct block_device *bdev)
>  	return 0;
>  }
>
> +static inline unsigned int bdev_zoned(struct block_device *bdev)
> +{
> +	struct request_queue *q = bdev_get_queue(bdev);
> +
> +	if (q)
> +		return blk_queue_zoned(q);
> +
> +	return 0;
> +}

Hello Damien,

Please consider changing the return type of the above two functions into 
"enum blk_zoned_model" to make it clear that both return one of the 
BLK_ZONED_* constants.

Thanks,

Bart.
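
A minimal sketch of the suggested signature change, assuming reduced stand-in
structures (the struct definitions below are simplified for illustration and
are not the kernel's; bdev->queue stands in for bdev_get_queue()):

```c
#include <stddef.h>

enum blk_zoned_model {
	BLK_ZONED_NONE,	/* Regular block device */
	BLK_ZONED_HA,	/* Host-aware zoned block device */
	BLK_ZONED_HM,	/* Host-managed zoned block device */
};

/* Reduced stand-ins for the kernel structures (assumption). */
struct queue_limits { enum blk_zoned_model zoned; };
struct request_queue { struct queue_limits limits; };
struct block_device { struct request_queue *queue; };

static inline enum blk_zoned_model blk_queue_zoned(struct request_queue *q)
{
	return q->limits.zoned;
}

static inline enum blk_zoned_model bdev_zoned(struct block_device *bdev)
{
	struct request_queue *q = bdev->queue; /* stands in for bdev_get_queue() */

	if (q)
		return blk_queue_zoned(q);

	return BLK_ZONED_NONE;
}
```

With the enum return type, the no-queue fallback can also return
BLK_ZONED_NONE by name instead of a bare 0.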

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 4/9] block: Define zoned block device operations
  2016-09-19 21:27   ` Damien Le Moal
@ 2016-09-20  4:05     ` Bart Van Assche
  -1 siblings, 0 replies; 36+ messages in thread
From: Bart Van Assche @ 2016-09-20  4:05 UTC (permalink / raw)
  To: Damien Le Moal, linux-scsi, linux-block
  Cc: martin.petersen, axboe, hare, shaun.tancheff

On 09/19/16 14:27, Damien Le Moal wrote:
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 36c7ac3..4a7f7ba 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -1941,6 +1941,13 @@ generic_make_request_checks(struct bio *bio)
>  	case REQ_OP_WRITE_SAME:
>  		if (!bdev_write_same(bio->bi_bdev))
>  			goto not_supported;
> +	case REQ_OP_ZONE_REPORT:
> +	case REQ_OP_ZONE_RESET:
> +	case REQ_OP_ZONE_OPEN:
> +	case REQ_OP_ZONE_CLOSE:
> +	case REQ_OP_ZONE_FINISH:
> +		if (!bdev_zoned(bio->bi_bdev))
> +			goto not_supported;

In patch 1/9 the BLK_ZONED_* constants have been introduced. The above
code compares the bdev_zoned() return value against 0. That means that
the above code will break if the numeric values of the BLK_ZONED_*
constants are ever changed. Please change the above code such that it
compares the bdev_zoned() return value against a BLK_ZONED_* constant.

> + * Operation:
> + * REQ_OP_ZONE_REPORT: Request information for all zones or for a single zone.
> + * REQ_OP_ZONE_RESET: Reset the write pointer of all zones or of a single zone.
> + * REQ_OP_ZONE_OPEN: Explicitely open the maximum allowed number of zones or
> + *                   a single zone. For the former case, the zones that will
> + *                   actually be open are chosen by the disk.
> + * REQ_OP_ZONE_CLOSE: Close all implicitely or explicitely open zones or
> + *                    a single zone.
> + * REQ_OP_ZONE_FINISH: Transition one or all open and closed zones to the full
> + *                     condition.

Please change *plicitely into *plicitly.

Thanks,

Bart.
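
A minimal sketch of the explicit comparison requested in the first comment,
assuming a reduced, hypothetical set of request operations (enum req_op and
zone_op_supported() are illustrative names, not kernel API):

```c
#include <stdbool.h>

enum blk_zoned_model { BLK_ZONED_NONE, BLK_ZONED_HA, BLK_ZONED_HM };

/* Hypothetical subset of request operations, for illustration only. */
enum req_op { REQ_OP_READ, REQ_OP_ZONE_REPORT, REQ_OP_ZONE_RESET };

/*
 * Return true if the operation is acceptable for a device of the given
 * model. Zone operations are rejected by comparing against BLK_ZONED_NONE
 * explicitly instead of relying on that constant being 0.
 */
static bool zone_op_supported(enum req_op op, enum blk_zoned_model model)
{
	switch (op) {
	case REQ_OP_ZONE_REPORT:
	case REQ_OP_ZONE_RESET:
		return model != BLK_ZONED_NONE;
	default:
		return true;
	}
}
```

The check keeps working if the enumerators are ever reordered, which a plain
truth test of the return value would not guarantee.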

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 5/9] block: Implement support for zoned block devices
  2016-09-19 21:27   ` Damien Le Moal
@ 2016-09-20  4:18     ` Bart Van Assche
  -1 siblings, 0 replies; 36+ messages in thread
From: Bart Van Assche @ 2016-09-20  4:18 UTC (permalink / raw)
  To: Damien Le Moal, linux-scsi, linux-block
  Cc: martin.petersen, axboe, hare, shaun.tancheff

On 09/19/16 14:27, Damien Le Moal wrote:
> +	/*
> +	 * Make sure bi_size does not overflow because
> +	 * of some weird very large zone size.
> +	 */
> +	if (nr_sects && (unsigned long long)nr_sects << 9 > UINT_MAX)
> +		return -EINVAL;
> +
> +	bio = bio_alloc(gfp_mask, 1);
> +	if (!bio)
> +		return -ENOMEM;
> +
> +	bio->bi_iter.bi_sector = sector;
> +	bio->bi_iter.bi_size = nr_sects << 9;
> +	bio->bi_vcnt = 0;
> +	bio->bi_bdev = bdev;
> +	bio_set_op_attrs(bio, op, 0);

Hello Damien and Hannes,

nr_sects is cast to unsigned long long for the overflow test but not 
when assigning bi_size. To me this looks like an inconsistency. Please 
make both expressions consistent.

Thanks,

Bart.
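
One way to make the two expressions consistent is to compute the widened
product once and use it for both the range check and the assignment. This is
a minimal userspace sketch under stated assumptions: the sector_t typedef is
a stand-in for the kernel type, sectors_to_bi_size() is a hypothetical
helper, and unsigned int is assumed to be the type of bi_size.

```c
#include <limits.h>
#include <stdbool.h>

typedef unsigned long long sector_t; /* stand-in for the kernel type (assumption) */

/*
 * Hypothetical helper: convert a length in 512-byte sectors to a byte
 * count that must fit in an unsigned int. The widened product is computed
 * once and used for both the range check and the assignment.
 */
static bool sectors_to_bi_size(sector_t nr_sects, unsigned int *bi_size)
{
	unsigned long long bytes;

	/* Reject values whose shift would overflow 64 bits. */
	if (nr_sects > (ULLONG_MAX >> 9))
		return false;

	bytes = (unsigned long long)nr_sects << 9;
	if (bytes > UINT_MAX)
		return false;

	*bi_size = (unsigned int)bytes;
	return true;
}
```

Sharing one intermediate value removes the question of whether the check and
the assignment are evaluated at the same width.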

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 8/9] sd: Implement support for ZBC devices
  2016-09-19 21:27   ` Damien Le Moal
@ 2016-09-20  5:40     ` Shaun Tancheff
  -1 siblings, 0 replies; 36+ messages in thread
From: Shaun Tancheff @ 2016-09-20  5:40 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: linux-scsi, linux-block, Martin K. Petersen, Jens Axboe,
	Hannes Reinecke, Hannes Reinecke

On Mon, Sep 19, 2016 at 4:27 PM, Damien Le Moal <damien.lemoal@hgst.com> wrote:
> From: Hannes Reinecke <hare@suse.com>
>
> Implement ZBC support functions to setup zoned disks and fill the
> block device zone information tree during the device scan. The
> zone information tree is also always updated on disk revalidation.
> This adds support for the REQ_OP_ZONE* operations and also implements
> the new RESET_WP provisioning mode so that discard requests can be
> mapped to the RESET WRITE POINTER command for devices with a constant
> zone size.
>
> The capacity read of the device triggers the zone information read
> for zoned block devices. As this needs the device zone model, the
> call to sd_read_capacity is moved after the call to
> sd_read_block_characteristics so that host-aware devices are
> properly initialized. The call to sd_zbc_read_zones in
> sd_read_capacity may change the device capacity obtained with
> the sd_read_capacity_16 function for devices reporting only the
> capacity of conventional zones at the beginning of the LBA range
> (i.e. devices with rc_basis set to 0).
>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
> ---
>  drivers/scsi/Makefile     |    1 +
>  drivers/scsi/sd.c         |  147 ++++--
>  drivers/scsi/sd.h         |   68 +++
>  drivers/scsi/sd_zbc.c     | 1097 +++++++++++++++++++++++++++++++++++++++++++++
>  include/scsi/scsi_proto.h |   17 +
>  5 files changed, 1304 insertions(+), 26 deletions(-)
>  create mode 100644 drivers/scsi/sd_zbc.c
>
> diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile
> index d539798..fabcb6d 100644
> --- a/drivers/scsi/Makefile
> +++ b/drivers/scsi/Makefile
> @@ -179,6 +179,7 @@ hv_storvsc-y                        := storvsc_drv.o
>
>  sd_mod-objs    := sd.o
>  sd_mod-$(CONFIG_BLK_DEV_INTEGRITY) += sd_dif.o
> +sd_mod-$(CONFIG_BLK_DEV_ZONED) += sd_zbc.o
>
>  sr_mod-objs    :=3D sr.o sr_ioctl.o sr_vendor.o
>  ncr53c8xx-flags-$(CONFIG_SCSI_ZALON) \
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index d3e852a..46b8b78 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -92,6 +92,7 @@ MODULE_ALIAS_BLOCKDEV_MAJOR(SCSI_DISK15_MAJOR);
>  MODULE_ALIAS_SCSI_DEVICE(TYPE_DISK);
>  MODULE_ALIAS_SCSI_DEVICE(TYPE_MOD);
>  MODULE_ALIAS_SCSI_DEVICE(TYPE_RBC);
> +MODULE_ALIAS_SCSI_DEVICE(TYPE_ZBC);
>
>  #if !defined(CONFIG_DEBUG_BLOCK_EXT_DEVT)
>  #define SD_MINORS      16
> @@ -99,7 +100,6 @@ MODULE_ALIAS_SCSI_DEVICE(TYPE_RBC);
>  #define SD_MINORS      0
>  #endif
>
> -static void sd_config_discard(struct scsi_disk *, unsigned int);
>  static void sd_config_write_same(struct scsi_disk *);
>  static int  sd_revalidate_disk(struct gendisk *);
>  static void sd_unlock_native_capacity(struct gendisk *disk);
> @@ -162,7 +162,7 @@ cache_type_store(struct device *dev, struct device_attribute *attr,
>         static const char temp[] = "temporary ";
>         int len;
>
> -       if (sdp->type != TYPE_DISK)
> +       if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
>                 /* no cache control on RBC devices; theoretically they
>                  * can do it, but there's probably so many exceptions
>                  * it's not worth the risk */
> @@ -261,7 +261,7 @@ allow_restart_store(struct device *dev, struct device_attribute *attr,
>         if (!capable(CAP_SYS_ADMIN))
>                 return -EACCES;
>
> -       if (sdp->type != TYPE_DISK)
> +       if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
>                 return -EINVAL;
>
>         sdp->allow_restart = simple_strtoul(buf, NULL, 10);
> @@ -369,6 +369,7 @@ static const char *lbp_mode[] = {
>         [SD_LBP_WS16]           = "writesame_16",
>         [SD_LBP_WS10]           = "writesame_10",
>         [SD_LBP_ZERO]           = "writesame_zero",
> +       [SD_ZBC_RESET_WP]       = "reset_wp",
>         [SD_LBP_DISABLE]        = "disabled",
>         [SD_LBP_DISABLE]        =3D "disabled",
>  };
>
> @@ -391,6 +392,13 @@ provisioning_mode_store(struct device *dev, struct device_attribute *attr,
>         if (!capable(CAP_SYS_ADMIN))
>                 return -EACCES;
>
> +       if (sdkp->zoned == 1 || sdp->type == TYPE_ZBC) {
> +               if (!strncmp(buf, lbp_mode[SD_ZBC_RESET_WP], 20)) {
> +                       sd_config_discard(sdkp, SD_ZBC_RESET_WP);
> +                       return count;
> +               }
> +               return -EINVAL;
> +       }
>         if (sdp->type != TYPE_DISK)
>                 return -EINVAL;
>
> @@ -458,7 +466,7 @@ max_write_same_blocks_store(struct device *dev, struct device_attribute *attr,
>         if (!capable(CAP_SYS_ADMIN))
>                 return -EACCES;
>
> -       if (sdp->type != TYPE_DISK)
> +       if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
>                 return -EINVAL;
>
>         err = kstrtoul(buf, 10, &max);
> @@ -631,7 +639,7 @@ static unsigned char sd_setup_protect_cmnd(struct scsi_cmnd *scmd,
>         return protect;
>  }
>
> -static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode)
> +void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode)
>  {
>         struct request_queue *q = sdkp->disk->queue;
>         unsigned int logical_block_size = sdkp->device->sector_size;
> @@ -683,6 +691,11 @@ static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode)
>                 q->limits.discard_zeroes_data = sdkp->lbprz;
>                 break;
>
> +       case SD_ZBC_RESET_WP:
> +               max_blocks = min_not_zero(sdkp->max_unmap_blocks,
> +                                         (u32)SD_MAX_WS16_BLOCKS);
> +               break;
> +
>         case SD_LBP_ZERO:
>                 max_blocks = min_not_zero(sdkp->max_ws_blocks,
>                                           (u32)SD_MAX_WS10_BLOCKS);
> @@ -711,16 +724,20 @@ static int sd_setup_discard_cmnd(struct scsi_cmnd *cmd)
>         unsigned int nr_sectors = blk_rq_sectors(rq);
>         unsigned int nr_bytes = blk_rq_bytes(rq);
>         unsigned int len;
> -       int ret;
> +       int ret = BLKPREP_OK;
>         char *buf;
> -       struct page *page;
> +       struct page *page = NULL;
>
>         sector >>= ilog2(sdp->sector_size) - 9;
>         nr_sectors >>= ilog2(sdp->sector_size) - 9;
>
> -       page = alloc_page(GFP_ATOMIC | __GFP_ZERO);
> -       if (!page)
> -               return BLKPREP_DEFER;
> +       if (sdkp->provisioning_mode != SD_ZBC_RESET_WP) {
> +               page = alloc_page(GFP_ATOMIC | __GFP_ZERO);
> +               if (!page)
> +                       return BLKPREP_DEFER;
> +       }
> +
> +       rq->completion_data = page;
>
>         switch (sdkp->provisioning_mode) {
>         case SD_LBP_UNMAP:
> @@ -760,12 +777,19 @@ static int sd_setup_discard_cmnd(struct scsi_cmnd *cmd)
>                 len = sdkp->device->sector_size;
>                 break;
>
> +       case SD_ZBC_RESET_WP:
> +               ret = sd_zbc_setup_reset_cmnd(cmd);
> +               if (ret != BLKPREP_OK)
> +                       goto out;
> +               /* Reset Write Pointer doesn't have a payload */
> +               len = 0;
> +               break;
> +
>         default:
>                 ret = BLKPREP_INVALID;
>                 goto out;
>         }
>
> -       rq->completion_data = page;
>         rq->timeout = SD_TIMEOUT;
>
>         cmd->transfersize = len;
> @@ -779,13 +803,17 @@ static int sd_setup_discard_cmnd(struct scsi_cmnd *cmd)
>          * discarded on disk. This allows us to report completion on the full
>          * amount of blocks described by the request.
>          */
> -       blk_add_request_payload(rq, page, 0, len);
> -       ret = scsi_init_io(cmd);
> +       if (len) {
> +               blk_add_request_payload(rq, page, 0, len);
> +               ret = scsi_init_io(cmd);
> +       }
>         rq->__data_len = nr_bytes;
>
>  out:
> -       if (ret != BLKPREP_OK)
> +       if (page && ret != BLKPREP_OK) {
> +               rq->completion_data = NULL;
>                 __free_page(page);
> +       }
>         return ret;
>  }
>
> @@ -843,6 +871,13 @@ static int sd_setup_write_same_cmnd(struct scsi_cmnd *cmd)
>
>         BUG_ON(bio_offset(bio) || bio_iovec(bio).bv_len != sdp->sector_size);
>
> +       if (sdkp->zoned == 1 || sdp->type == TYPE_ZBC) {
> +               /* sd_zbc_setup_read_write uses block layer sector units */
> +               ret = sd_zbc_setup_read_write(sdkp, rq, sector, &nr_sectors);
> +               if (ret != BLKPREP_OK)
> +                       return ret;
> +       }
> +
>         sector >>= ilog2(sdp->sector_size) - 9;
>         nr_sectors >>= ilog2(sdp->sector_size) - 9;
>
> @@ -962,6 +997,13 @@ static int sd_setup_read_write_cmnd(struct scsi_cmnd *SCpnt)
>         SCSI_LOG_HLQUEUE(2, scmd_printk(KERN_INFO, SCpnt, "block=%llu\n",
>                                         (unsigned long long)block));
>
> +       if (sdkp->zoned == 1 || sdp->type == TYPE_ZBC) {
> +               /* sd_zbc_setup_read_write uses block layer sector units */
> +               ret = sd_zbc_setup_read_write(sdkp, rq, block, &this_count);
> +               if (ret != BLKPREP_OK)
> +                       goto out;
> +       }
> +
>         /*
>          * If we have a 1K hardware sectorsize, prevent access to single
>          * 512 byte sectors.  In theory we could handle this - in fact
> @@ -1148,6 +1190,16 @@ static int sd_init_command(struct scsi_cmnd *cmd)
>         case REQ_OP_READ:
>         case REQ_OP_WRITE:
>                 return sd_setup_read_write_cmnd(cmd);
> +       case REQ_OP_ZONE_REPORT:
> +               return sd_zbc_setup_report_cmnd(cmd);
> +       case REQ_OP_ZONE_RESET:
> +               return sd_zbc_setup_reset_cmnd(cmd);
> +       case REQ_OP_ZONE_OPEN:
> +               return sd_zbc_setup_open_cmnd(cmd);
> +       case REQ_OP_ZONE_CLOSE:
> +               return sd_zbc_setup_close_cmnd(cmd);
> +       case REQ_OP_ZONE_FINISH:
> +               return sd_zbc_setup_finish_cmnd(cmd);
>         default:
>                 BUG();
>         }
> @@ -1157,7 +1209,8 @@ static void sd_uninit_command(struct scsi_cmnd *SCpnt)
>  {
>         struct request *rq = SCpnt->request;
>
> -       if (req_op(rq) == REQ_OP_DISCARD)
> +       if (req_op(rq) == REQ_OP_DISCARD &&
> +           rq->completion_data)
>                 __free_page(rq->completion_data);
>
>         if (SCpnt->cmnd != rq->cmd) {
> @@ -1778,8 +1831,16 @@ static int sd_done(struct scsi_cmnd *SCpnt)
>         int sense_deferred = 0;
>         unsigned char op = SCpnt->cmnd[0];
>         unsigned char unmap = SCpnt->cmnd[1] & 8;
> +       unsigned char sa = SCpnt->cmnd[1] & 0xf;
>
> -       if (req_op(req) == REQ_OP_DISCARD || req_op(req) == REQ_OP_WRITE_SAME) {
> +       switch(req_op(req)) {
> +       case REQ_OP_DISCARD:
> +       case REQ_OP_WRITE_SAME:
> +       case REQ_OP_ZONE_REPORT:
> +       case REQ_OP_ZONE_RESET:
> +       case REQ_OP_ZONE_OPEN:
> +       case REQ_OP_ZONE_CLOSE:
> +       case REQ_OP_ZONE_FINISH:
>                 if (!result) {
>                         good_bytes = blk_rq_bytes(req);
>                         scsi_set_resid(SCpnt, 0);
> @@ -1787,6 +1848,7 @@ static int sd_done(struct scsi_cmnd *SCpnt)
>                         good_bytes = 0;
>                         scsi_set_resid(SCpnt, blk_rq_bytes(req));
>                 }
> +               break;
>         }
>
>         if (result) {
> @@ -1829,6 +1891,10 @@ static int sd_done(struct scsi_cmnd *SCpnt)
>                         case UNMAP:
>                                 sd_config_discard(sdkp, SD_LBP_DISABLE);
>                                 break;
> +                       case ZBC_OUT:
> +                               if (sa == ZO_RESET_WRITE_POINTER)
> +                                       sd_config_discard(sdkp, SD_LBP_DISABLE);
> +                               break;
>                         case WRITE_SAME_16:
>                         case WRITE_SAME:
>                                 if (unmap)
> @@ -1847,7 +1913,11 @@ static int sd_done(struct scsi_cmnd *SCpnt)
>         default:
>                 break;
>         }
> +
>   out:
> +       if (sdkp->zoned == 1 || sdkp->device->type == TYPE_ZBC)
> +               sd_zbc_done(SCpnt, &sshdr);
> +
>         SCSI_LOG_HLCOMPLETE(1, scmd_printk(KERN_INFO, SCpnt,
>                                            "sd_done: completed %d of %d bytes\n",
>                                            good_bytes, scsi_bufflen(SCpnt)));
> @@ -1982,7 +2052,6 @@ sd_spinup_disk(struct scsi_disk *sdkp)
>         }
>  }
>
> -
>  /*
>   * Determine whether disk supports Data Integrity Field.
>   */
> @@ -2132,6 +2201,9 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
>         /* Logical blocks per physical block exponent */
>         sdkp->physical_block_size = (1 << (buffer[13] & 0xf)) * sector_size;
>
> +       /* RC basis */
> +       sdkp->rc_basis = (buffer[12] >> 4) & 0x3;
> +
>         /* Lowest aligned logical block */
>         alignment = ((buffer[14] & 0x3f) << 8 | buffer[15]) * sector_size;
>         blk_queue_alignment_offset(sdp->request_queue, alignment);
> @@ -2322,6 +2394,11 @@ got_data:
>                 sector_size = 512;
>         }
>         blk_queue_logical_block_size(sdp->request_queue, sector_size);
> +       blk_queue_physical_block_size(sdp->request_queue,
> +                                     sdkp->physical_block_size);
> +       sdkp->device->sector_size = sector_size;
> +
> +       sd_zbc_read_zones(sdkp, buffer);
>
>         {
>                 char cap_str_2[10], cap_str_10[10];
> @@ -2348,9 +2425,6 @@ got_data:
>         if (sdkp->capacity > 0xffffffff)
>                 sdp->use_16_for_rw = 1;
>
> -       blk_queue_physical_block_size(sdp->request_queue,
> -                                     sdkp->physical_block_size);
> -       sdkp->device->sector_size = sector_size;
>  }
>
>  /* called with buffer of length 512 */
> @@ -2612,7 +2686,7 @@ static void sd_read_app_tag_own(struct scsi_disk *sdkp, unsigned char *buffer)
>         struct scsi_mode_data data;
>         struct scsi_sense_hdr sshdr;
>
> -       if (sdp->type != TYPE_DISK)
> +       if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
>                 return;
>
>         if (sdkp->protection_type == 0)
> @@ -2719,6 +2793,7 @@ static void sd_read_block_limits(struct scsi_disk *sdkp)
>   */
>  static void sd_read_block_characteristics(struct scsi_disk *sdkp)
>  {
> +       struct request_queue *q = sdkp->disk->queue;
>         unsigned char *buffer;
>         u16 rot;
>         const int vpd_len = 64;
> @@ -2733,10 +2808,21 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
>         rot = get_unaligned_be16(&buffer[4]);
>
>         if (rot == 1) {
> -               queue_flag_set_unlocked(QUEUE_FLAG_NONROT, sdkp->disk->queue);
> -               queue_flag_clear_unlocked(QUEUE_FLAG_ADD_RANDOM, sdkp->disk->queue);
> +               queue_flag_set_unlocked(QUEUE_FLAG_NONROT, q);
> +               queue_flag_clear_unlocked(QUEUE_FLAG_ADD_RANDOM, q);
>         }
>
> +       sdkp->zoned = (buffer[8] >> 4) & 3;
> +       if (sdkp->zoned == 1)
> +               q->limits.zoned = BLK_ZONED_HA;
> +       else if (sdkp->device->type == TYPE_ZBC)
> +               q->limits.zoned = BLK_ZONED_HM;
> +       else
> +               q->limits.zoned = BLK_ZONED_NONE;
> +       if (blk_queue_zoned(q) && sdkp->first_scan)
> +               sd_printk(KERN_NOTICE, sdkp, "Host-%s zoned block device\n",
> +                         q->limits.zoned == BLK_ZONED_HM ? "managed" : "aware");
> +
>   out:
>         kfree(buffer);
>  }
> @@ -2835,14 +2921,14 @@ static int sd_revalidate_disk(struct gendisk *disk)
>          * react badly if we do.
>          */
>         if (sdkp->media_present) {
> -               sd_read_capacity(sdkp, buffer);
> -
>                 if (scsi_device_supports_vpd(sdp)) {
>                         sd_read_block_provisioning(sdkp);
>                         sd_read_block_limits(sdkp);
>                         sd_read_block_characteristics(sdkp);
>                 }
>
> +               sd_read_capacity(sdkp, buffer);
> +
>                 sd_read_write_protect_flag(sdkp, buffer);
>                 sd_read_cache_type(sdkp, buffer);
>                 sd_read_app_tag_own(sdkp, buffer);
> @@ -3040,9 +3126,16 @@ static int sd_probe(struct device *dev)
>
>         scsi_autopm_get_device(sdp);
>         error = -ENODEV;
> -       if (sdp->type != TYPE_DISK && sdp->type != TYPE_MOD && sdp->type != TYPE_RBC)
> +       if (sdp->type != TYPE_DISK &&
> +           sdp->type != TYPE_ZBC &&
> +           sdp->type != TYPE_MOD &&
> +           sdp->type != TYPE_RBC)
>                 goto out;
>
> +#ifndef CONFIG_BLK_DEV_ZONED
> +       if (sdp->type == TYPE_ZBC)
> +               goto out;
> +#endif
>         SCSI_LOG_HLQUEUE(3, sdev_printk(KERN_INFO, sdp,
>                                         "sd_probe\n"));
>
> @@ -3146,6 +3239,8 @@ static int sd_remove(struct device *dev)
>         del_gendisk(sdkp->disk);
>         sd_shutdown(dev);
>
> +       sd_zbc_remove(sdkp);
> +
>         blk_register_region(devt, SD_MINORS, NULL,
>                             sd_default_probe, NULL, NULL);
>
> diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
> index 765a6f1..3452871 100644
> --- a/drivers/scsi/sd.h
> +++ b/drivers/scsi/sd.h
> @@ -56,6 +56,7 @@ enum {
>         SD_LBP_WS16,            /* Use WRITE SAME(16) with UNMAP bit */
>         SD_LBP_WS10,            /* Use WRITE SAME(10) with UNMAP bit */
>         SD_LBP_ZERO,            /* Use WRITE SAME(10) with zero payload */
> +       SD_ZBC_RESET_WP,        /* Use RESET WRITE POINTER */
>         SD_LBP_DISABLE,         /* Discard disabled due to failed cmd */
>  };
>

Can we have the addition of SD_ZBC_RESET_WP as a separate patch?


> @@ -64,6 +65,11 @@ struct scsi_disk {
>         struct scsi_device *device;
>         struct device   dev;
>         struct gendisk  *disk;
> +#ifdef CONFIG_BLK_DEV_ZONED
> +       struct workqueue_struct *zone_work_q;
> +       sector_t zone_sectors;
> +       unsigned int nr_zones;
> +#endif
>         atomic_t        openers;
>         sector_t        capacity;       /* size in logical blocks */
>         u32             max_xfer_blocks;
> @@ -94,6 +100,8 @@ struct scsi_disk {
>         unsigned        lbpvpd : 1;
>         unsigned        ws10 : 1;
>         unsigned        ws16 : 1;
> +       unsigned        rc_basis: 2;
> +       unsigned        zoned: 2;
>  };
>  #define to_scsi_disk(obj) container_of(obj,struct scsi_disk,dev)
>
> @@ -156,6 +164,13 @@ static inline unsigned int logical_to_bytes(struct scsi_device *sdev, sector_t b
>         return blocks * sdev->sector_size;
>  }
>
> +static inline sector_t sectors_to_logical(struct scsi_device *sdev, sector_t sector)
> +{
> +       return sector >> (ilog2(sdev->sector_size) - 9);
> +}
> +
> +extern void sd_config_discard(struct scsi_disk *, unsigned int);
> +
>  /*
>   * A DIF-capable target device can be formatted with different
>   * protection schemes.  Currently 0 through 3 are defined:
> @@ -269,4 +284,57 @@ static inline void sd_dif_complete(struct scsi_cmnd *cmd, unsigned int a)
>
>  #endif /* CONFIG_BLK_DEV_INTEGRITY */
>
> +#ifdef CONFIG_BLK_DEV_ZONED
> +
> +extern void sd_zbc_read_zones(struct scsi_disk *, char *);
> +extern void sd_zbc_remove(struct scsi_disk *);
> +extern int sd_zbc_setup_read_write(struct scsi_disk *, struct request *,
> +                                  sector_t, unsigned int *);
> +extern int sd_zbc_setup_report_cmnd(struct scsi_cmnd *);
> +extern int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *);
> +extern int sd_zbc_setup_open_cmnd(struct scsi_cmnd *);
> +extern int sd_zbc_setup_close_cmnd(struct scsi_cmnd *);
> +extern int sd_zbc_setup_finish_cmnd(struct scsi_cmnd *);
> +extern void sd_zbc_done(struct scsi_cmnd *, struct scsi_sense_hdr *);
> +
> +#else /* CONFIG_BLK_DEV_ZONED */
> +
> +static inline void sd_zbc_read_zones(struct scsi_disk *sdkp,
> +                                    unsigned char *buf) {}
> +static inline void sd_zbc_remove(struct scsi_disk *sdkp) {}
> +
> +static inline int sd_zbc_setup_read_write(struct scsi_disk *sdkp,
> +                                         struct request *rq, sector_t sector,
> +                                         unsigned int *num_sectors)
> +{
> +       /* Let the drive fail requests */
> +       return BLKPREP_OK;
> +}
> +
> +static inline int sd_zbc_setup_report_cmnd(struct scsi_cmnd *cmd)
> +{
> +       return BLKPREP_KILL;
> +}
> +static inline int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *cmd)
> +{
> +       return BLKPREP_KILL;
> +}
> +static inline int sd_zbc_setup_open_cmnd(struct scsi_cmnd *cmd)
> +{
> +       return BLKPREP_KILL;
> +}
> +static inline int sd_zbc_setup_close_cmnd(struct scsi_cmnd *cmd)
> +{
> +       return BLKPREP_KILL;
> +}
> +static inline int sd_zbc_setup_finish_cmnd(struct scsi_cmnd *cmd)
> +{
> +       return BLKPREP_KILL;
> +}
> +
> +static inline void sd_zbc_done(struct scsi_cmnd *cmd,
> +                              struct scsi_sense_hdr *sshdr) {}
> +
> +#endif /* CONFIG_BLK_DEV_ZONED */
> +
>  #endif /* _SCSI_DISK_H */
> diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
> new file mode 100644
> index 0000000..ec9c3fc
> --- /dev/null
> +++ b/drivers/scsi/sd_zbc.c
> @@ -0,0 +1,1097 @@
> +/*
> + * SCSI Zoned Block commands
> + *
> + * Copyright (C) 2014-2015 SUSE Linux GmbH
> + * Written by: Hannes Reinecke <hare@suse.de>
> + * Modified by: Damien Le Moal <damien.lemoal@hgst.com>
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License version
> + * 2 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; see the file COPYING.  If not, write to
> + * the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139,
> + * USA.
> + *
> + */
> +
> +#include <linux/blkdev.h>
> +#include <linux/rbtree.h>
> +
> +#include <asm/unaligned.h>
> +
> +#include <scsi/scsi.h>
> +#include <scsi/scsi_cmnd.h>
> +#include <scsi/scsi_dbg.h>
> +#include <scsi/scsi_device.h>
> +#include <scsi/scsi_driver.h>
> +#include <scsi/scsi_host.h>
> +#include <scsi/scsi_eh.h>
> +
> +#include "sd.h"
> +#include "scsi_priv.h"
> +
> +enum zbc_zone_type {
> +       ZBC_ZONE_TYPE_CONV = 0x1,
> +       ZBC_ZONE_TYPE_SEQWRITE_REQ,
> +       ZBC_ZONE_TYPE_SEQWRITE_PREF,
> +       ZBC_ZONE_TYPE_RESERVED,
> +};
> +
> +enum zbc_zone_cond {
> +       ZBC_ZONE_COND_NO_WP,
> +       ZBC_ZONE_COND_EMPTY,
> +       ZBC_ZONE_COND_IMP_OPEN,
> +       ZBC_ZONE_COND_EXP_OPEN,
> +       ZBC_ZONE_COND_CLOSED,
> +       ZBC_ZONE_COND_READONLY = 0xd,
> +       ZBC_ZONE_COND_FULL,
> +       ZBC_ZONE_COND_OFFLINE,
> +};
> +
> +#define SD_ZBC_BUF_SIZE 131072
> +
> +#define sd_zbc_debug(sdkp, fmt, args...)                       \
> +       pr_debug("%s %s [%s]: " fmt,                            \
> +                dev_driver_string(&(sdkp)->device->sdev_gendev), \
> +                dev_name(&(sdkp)->device->sdev_gendev),         \
> +                (sdkp)->disk->disk_name, ## args)
> +
> +#define sd_zbc_debug_ratelimit(sdkp, fmt, args...)             \
> +       do {                                                    \
> +               if (printk_ratelimit())                         \
> +                       sd_zbc_debug(sdkp, fmt, ## args);       \
> +       } while( 0 )
> +
> +#define sd_zbc_err(sdkp, fmt, args...)                         \
> +       pr_err("%s %s [%s]: " fmt,                              \
> +              dev_driver_string(&(sdkp)->device->sdev_gendev), \
> +              dev_name(&(sdkp)->device->sdev_gendev),          \
> +              (sdkp)->disk->disk_name, ## args)
> +
> +struct zbc_zone_work {
> +       struct work_struct      zone_work;
> +       struct scsi_disk        *sdkp;
> +       sector_t                sector;
> +       sector_t                nr_sects;
> +       bool                    init;
> +       unsigned int            nr_zones;
> +};
> +
> +struct blk_zone *zbc_desc_to_zone(struct scsi_disk *sdkp, unsigned char *rec)
> +{
> +       struct blk_zone *zone;
> +
> +       zone = kzalloc(sizeof(struct blk_zone), GFP_KERNEL);
> +       if (!zone)
> +               return NULL;
> +
> +       /* Zone type */
> +       switch(rec[0] & 0x0f) {
> +       case ZBC_ZONE_TYPE_CONV:
> +       case ZBC_ZONE_TYPE_SEQWRITE_REQ:
> +       case ZBC_ZONE_TYPE_SEQWRITE_PREF:
> +               zone->type = rec[0] & 0x0f;
> +               break;
> +       default:
> +               zone->type = BLK_ZONE_TYPE_UNKNOWN;
> +               break;
> +       }
> +
> +       /* Zone condition */
> +       zone->cond = (rec[1] >> 4) & 0xf;
> +       if (rec[1] & 0x01)
> +               zone->reset = 1;
> +       if (rec[1] & 0x02)
> +               zone->non_seq = 1;
> +
> +       /* Zone start sector and length */
> +       zone->len = logical_to_sectors(sdkp->device,
> +                                      get_unaligned_be64(&rec[8]));
> +       zone->start = logical_to_sectors(sdkp->device,
> +                                        get_unaligned_be64(&rec[16]));
> +
> +       /* Zone write pointer */
> +       if (blk_zone_is_empty(zone) &&
> +           zone->wp != zone->start)
> +               zone->wp = zone->start;
> +       else if (blk_zone_is_full(zone))
> +               zone->wp = zone->start + zone->len;
> +       else if (blk_zone_is_seq(zone))
> +               zone->wp = logical_to_sectors(sdkp->device,
> +                                             get_unaligned_be64(&rec[24]));
> +       else
> +               zone->wp = (sector_t)-1;
> +
> +       return zone;
> +}
> +
> +static int zbc_parse_zones(struct scsi_disk *sdkp, unsigned char *buf,
> +                          unsigned int buf_len, sector_t *next_sector)
> +{
> +       struct request_queue *q = sdkp->disk->queue;
> +       sector_t capacity = logical_to_sectors(sdkp->device, sdkp->capacity);
> +       unsigned char *rec = buf;
> +       unsigned int zone_len, list_length;
> +
> +       /* Parse REPORT ZONES header */
> +       list_length = get_unaligned_be32(&buf[0]);
> +       rec = buf + 64;
> +       list_length += 64;
> +
> +       if (list_length < buf_len)
> +               buf_len = list_length;
> +
> +       /* Parse REPORT ZONES zone descriptors */
> +       *next_sector = capacity;
> +       while (rec < buf + buf_len) {
> +
> +               struct blk_zone *new, *old;
> +
> +               new = zbc_desc_to_zone(sdkp, rec);
> +               if (!new)
> +                       return -ENOMEM;
> +
> +               zone_len = new->len;
> +               *next_sector = new->start + zone_len;
> +
> +               old = blk_insert_zone(q, new);
> +               if (old) {
> +                       blk_lock_zone(old);
> +
> +                       /*
> +                        * Always update the zone state flags and the zone
> +                        * offline and read-only condition as the drive may
> +                        * change those independently of the commands being
> +                        * executed
> +                        */
> +                       old->reset = new->reset;
> +                       old->non_seq = new->non_seq;
> +                       if (blk_zone_is_offline(new) ||
> +                           blk_zone_is_readonly(new))
> +                               old->cond = new->cond;
> +
> +                       if (blk_zone_in_update(old)) {
> +                               old->cond = new->cond;
> +                               old->wp = new->wp;
> +                               blk_clear_zone_update(old);
> +                       }
> +
> +                       blk_unlock_zone(old);
> +
> +                       kfree(new);
> +               }
> +
> +               rec += 64;
> +
> +       }
> +
> +       return 0;
> +}
> +
> +/**
> + * sd_zbc_report_zones - Issue a REPORT ZONES scsi command
> + * @sdkp: SCSI disk to which the command should be send
> + * @buffer: response buffer
> + * @bufflen: length of @buffer
> + * @start_sector: logical sector for the zone information should be reported
> + * @option: reporting option to be used
> + * @partial: flag to set the 'partial' bit for report zones command
> + */
> +int sd_zbc_report_zones(struct scsi_disk *sdkp, unsigned char *buffer,
> +                       int bufflen, sector_t start_sector,
> +                       enum zbc_zone_reporting_options option, bool partial)
> +{
> +       struct scsi_device *sdp = sdkp->device;
> +       const int timeout = sdp->request_queue->rq_timeout;
> +       struct scsi_sense_hdr sshdr;
> +       sector_t start_lba = sectors_to_logical(sdkp->device, start_sector);
> +       unsigned char cmd[16];
> +       int result;
> +
> +       if (!scsi_device_online(sdp))
> +               return -ENODEV;
> +
> +       sd_zbc_debug(sdkp, "REPORT ZONES lba %zu len %d\n",
> +                    start_lba, bufflen);
> +
> +       memset(cmd, 0, 16);
> +       cmd[0] = ZBC_IN;
> +       cmd[1] = ZI_REPORT_ZONES;
> +       put_unaligned_be64(start_lba, &cmd[2]);
> +       put_unaligned_be32(bufflen, &cmd[10]);
> +       cmd[14] = (partial ? ZBC_REPORT_ZONE_PARTIAL : 0) | option;
> +       memset(buffer, 0, bufflen);
> +
> +       result = scsi_execute_req(sdp, cmd, DMA_FROM_DEVICE,
> +                               buffer, bufflen, &sshdr,
> +                               timeout, SD_MAX_RETRIES, NULL);
> +
> +       if (result) {
> +               sd_zbc_err(sdkp,
> +                          "REPORT ZONES lba %zu failed with %d/%d\n",
> +                          start_lba, host_byte(result), driver_byte(result));
> +               return -EIO;
> +       }
> +
> +       return 0;
> +}
> +
> +/**
> + * Set or clear the update flag of all zones contained
> + * in the range sector..sector+nr_sects.
> + * Return the number of zones marked/cleared.
> + */
> +static int __sd_zbc_zones_updating(struct scsi_disk *sdkp,
> +                                  sector_t sector, sector_t nr_sects,
> +                                  bool set)
> +{
> +       struct request_queue *q = sdkp->disk->queue;
> +       struct blk_zone *zone;
> +       struct rb_node *node;
> +       unsigned long flags;
> +       int nr_zones = 0;
> +
> +       if (!nr_sects) {
> +               /* All zones */
> +               sector = 0;
> +               nr_sects = logical_to_sectors(sdkp->device, sdkp->capacity);
> +       }
> +
> +       spin_lock_irqsave(&q->zones_lock, flags);
> +       for (node = rb_first(&q->zones); node && nr_sects; node = rb_next(node)) {
> +               zone = rb_entry(node, struct blk_zone, node);
> +               if (sector < zone->start || sector >= (zone->start + zone->len))
> +                       continue;
> +               if (set) {
> +                       if (!test_and_set_bit_lock(BLK_ZONE_IN_UPDATE, &zone->flags))
> +                               nr_zones++;
> +               } else if (test_and_clear_bit(BLK_ZONE_IN_UPDATE, &zone->flags)) {
> +                       wake_up_bit(&zone->flags, BLK_ZONE_IN_UPDATE);
> +                       nr_zones++;
> +               }
> +               sector = zone->start + zone->len;
> +               if (nr_sects <= zone->len)
> +                       nr_sects = 0;
> +               else
> +                       nr_sects -= zone->len;
> +       }
> +       spin_unlock_irqrestore(&q->zones_lock, flags);
> +
> +       return nr_zones;
> +}
> +
> +static inline int sd_zbc_set_zones_updating(struct scsi_disk *sdkp,
> +                                           sector_t sector, sector_t nr_sects)
> +{
> +       return __sd_zbc_zones_updating(sdkp, sector, nr_sects, true);
> +}
> +
> +static inline int sd_zbc_clear_zones_updating(struct scsi_disk *sdkp,
> +                                             sector_t sector, sector_t nr_sects)
> +{
> +       return __sd_zbc_zones_updating(sdkp, sector, nr_sects, false);
> +}
> +
> +static void sd_zbc_start_queue(struct request_queue *q)
> +{
> +       unsigned long flags;
> +
> +       if (q->mq_ops) {
> +               blk_mq_start_hw_queues(q);
> +       } else {
> +               spin_lock_irqsave(q->queue_lock, flags);
> +               blk_start_queue(q);
> +               spin_unlock_irqrestore(q->queue_lock, flags);
> +       }
> +}
> +
> +static void sd_zbc_update_zone_work(struct work_struct *work)
> +{
> +       struct zbc_zone_work *zwork =
> +               container_of(work, struct zbc_zone_work, zone_work);
> +       struct scsi_disk *sdkp = zwork->sdkp;
> +       sector_t capacity = logical_to_sectors(sdkp->device, sdkp->capacity);
> +       struct request_queue *q = sdkp->disk->queue;
> +       sector_t end_sector, sector = zwork->sector;
> +       unsigned int bufsize;
> +       unsigned char *buf;
> +       int ret = -ENOMEM;
> +
> +       /* Get a buffer */
> +       if (!zwork->nr_zones) {
> +               bufsize = SD_ZBC_BUF_SIZE;
> +       } else {
> +               bufsize = (zwork->nr_zones + 1) * 64;
> +               if (bufsize < 512)
> +                       bufsize = 512;
> +               else if (bufsize > SD_ZBC_BUF_SIZE)
> +                               bufsize = SD_ZBC_BUF_SIZE;
> +               else
> +                       bufsize = (bufsize + 511) & ~511;
> +       }
> +       buf = kmalloc(bufsize, GFP_KERNEL | GFP_DMA);
> +       if (!buf) {
> +               sd_zbc_err(sdkp, "Failed to allocate zone report buffer\n");
> +               goto done_free;
> +       }
> +
> +       /* Process sector range */
> +       end_sector = zwork->sector + zwork->nr_sects;
> +       while(sector < min(end_sector, capacity)) {
> +
> +               /* Get zone report */
> +               ret = sd_zbc_report_zones(sdkp, buf, bufsize, sector,
> +                                         ZBC_ZONE_REPORTING_OPTION_ALL, true);
> +               if (ret)
> +                       break;
> +
> +               ret = zbc_parse_zones(sdkp, buf, bufsize, &sector);
> +               if (ret)
> +                       break;
> +
> +               /* Kick start the queue to allow requests waiting */
> +               /* for the zones just updated to run              */
> +               sd_zbc_start_queue(q);
> +
> +       }
> +
> +done_free:
> +       if (ret)
> +               sd_zbc_clear_zones_updating(sdkp, zwork->sector, zwork->nr_sects);
> +       if (buf)
> +               kfree(buf);
> +       kfree(zwork);
> +}
> +
> +/**
> + * sd_zbc_update_zones - Update zone information for zones starting
> + * from @start_sector. If not in init mode, the update is done only
> + * for zones marked with update flag.
> + * @sdkp: SCSI disk for which the zone information needs to be updated
> + * @start_sector: First sector of the first zone to be updated
> + * @bufsize: buffersize to be allocated for report zones
> + */
> +static int sd_zbc_update_zones(struct scsi_disk *sdkp,
> +                              sector_t sector, sector_t nr_sects,
> +                              gfp_t gfpflags, bool init)
> +{
> +       struct zbc_zone_work *zwork;
> +
> +       zwork = kzalloc(sizeof(struct zbc_zone_work), gfpflags);
> +       if (!zwork) {
> +               sd_zbc_err(sdkp, "Failed to allocate zone work\n");
> +               return -ENOMEM;
> +       }
> +
> +       if (!nr_sects) {
> +               /* All zones */
> +               sector = 0;
> +               nr_sects = logical_to_sectors(sdkp->device, sdkp->capacity);
> +       }
> +
> +       INIT_WORK(&zwork->zone_work, sd_zbc_update_zone_work);
> +       zwork->sdkp = sdkp;
> +       zwork->sector = sector;
> +       zwork->nr_sects = nr_sects;
> +       zwork->init = init;
> +
> +       if (!init)
> +               /* Mark the zones falling in the report as updating */
> +               zwork->nr_zones = sd_zbc_set_zones_updating(sdkp, sector, nr_sects);
> +
> +       if (init || zwork->nr_zones)
> +               queue_work(sdkp->zone_work_q, &zwork->zone_work);
> +       else
> +               kfree(zwork);
> +
> +       return 0;
> +}
> +
> +int sd_zbc_setup_report_cmnd(struct scsi_cmnd *cmd)
> +{
> +       struct request *rq = cmd->request;
> +       struct gendisk *disk = rq->rq_disk;
> +       struct scsi_disk *sdkp = scsi_disk(disk);
> +       int ret;
> +
> +       if (!sdkp->zone_work_q)
> +               return BLKPREP_KILL;
> +
> +       ret = sd_zbc_update_zones(sdkp, blk_rq_pos(rq), blk_rq_sectors(rq),
> +                                 GFP_ATOMIC, false);
> +       if (unlikely(ret))
> +               return BLKPREP_DEFER;
> +
> +       return BLKPREP_DONE;
> +}
> +
> +static void sd_zbc_setup_action_cmnd(struct scsi_cmnd *cmd,
> +                                    u8 action,
> +                                    bool all)
> +{
> +       struct request *rq = cmd->request;
> +       struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
> +       sector_t lba;
> +
> +       cmd->cmd_len = 16;
> +       cmd->cmnd[0] = ZBC_OUT;
> +       cmd->cmnd[1] = action;
> +       if (all) {
> +               cmd->cmnd[14] |= 0x01;
> +       } else {
> +               lba = sectors_to_logical(sdkp->device, blk_rq_pos(rq));
> +               put_unaligned_be64(lba, &cmd->cmnd[2]);
> +       }
> +
> +       rq->completion_data = NULL;
> +       rq->timeout = SD_TIMEOUT;
> +       rq->__data_len = blk_rq_bytes(rq);
> +
> +       /* Don't retry */
> +       cmd->allowed = 0;
> +       cmd->transfersize = 0;
> +       cmd->sc_data_direction = DMA_NONE;
> +}
> +
> +int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *cmd)
> +{
> +       struct request *rq = cmd->request;
> +       struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
> +       sector_t sector = blk_rq_pos(rq);
> +       sector_t nr_sects = blk_rq_sectors(rq);
> +       struct blk_zone *zone = NULL;
> +       int ret = BLKPREP_OK;
> +
> +       if (nr_sects) {
> +               zone = blk_lookup_zone(rq->q, sector);
> +               if (!zone)
> +                       return BLKPREP_KILL;
> +       }
> +
> +       if (zone) {
> +
> +               blk_lock_zone(zone);
> +
> +               /* If the zone is being updated, wait */
> +               if (blk_zone_in_update(zone)) {
> +                       ret = BLKPREP_DEFER;
> +                       goto out;
> +               }
> +
> +               if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
> +                       sd_zbc_debug(sdkp,
> +                                    "Discarding unknown zone %zu\n",
> +                                    zone->start);
> +                       ret = BLKPREP_KILL;
> +                       goto out;
> +               }
> +
> +               /* Nothing to do for conventional sequential zones */
> +               if (blk_zone_is_conv(zone)) {
> +                       ret = BLKPREP_DONE;
> +                       goto out;
> +               }
> +
> +               if (!blk_try_write_lock_zone(zone)) {
> +                       ret = BLKPREP_DEFER;
> +                       goto out;
> +               }
> +
> +               /* Nothing to do if the zone is already empty */
> +               if (blk_zone_is_empty(zone)) {
> +                       blk_write_unlock_zone(zone);
> +                       ret = BLKPREP_DONE;
> +                       goto out;
> +               }
> +
> +               if (sector != zone->start ||
> +                   (nr_sects != zone->len)) {
> +                       sd_printk(KERN_ERR, sdkp,
> +                                 "Unaligned reset wp request, start %zu/%zu"
> +                                 " len %zu/%zu\n",
> +                                 zone->start, sector, zone->len, nr_sects);
> +                       blk_write_unlock_zone(zone);
> +                       ret = BLKPREP_KILL;
> +                       goto out;
> +               }
> +
> +       }
> +
> +       sd_zbc_setup_action_cmnd(cmd, ZO_RESET_WRITE_POINTER, !zone);
> +
> +out:
> +       if (zone) {
> +               if (ret =3D=3D BLKPREP_OK) {
> +                       /*
> +                        * Opportunistic update. Will be fixed up
> +                        * with zone update if the command fails,
> +                        */
> +                       zone->wp =3D zone->start;
> +                       zone->cond =3D BLK_ZONE_COND_EMPTY;
> +                       zone->reset =3D 0;
> +                       zone->non_seq =3D 0;
> +               }
> +               blk_unlock_zone(zone);
> +       }
> +
> +       return ret;
> +}
> +
> +int sd_zbc_setup_open_cmnd(struct scsi_cmnd *cmd)
> +{
> +       struct request *rq = cmd->request;
> +       struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
> +       sector_t sector = blk_rq_pos(rq);
> +       sector_t nr_sects = blk_rq_sectors(rq);
> +       struct blk_zone *zone = NULL;
> +       int ret = BLKPREP_OK;
> +
> +       if (nr_sects) {
> +               zone = blk_lookup_zone(rq->q, sector);
> +               if (!zone)
> +                       return BLKPREP_KILL;
> +       }
> +
> +       if (zone) {
> +
> +               blk_lock_zone(zone);
> +
> +               /* If the zone is being updated, wait */
> +               if (blk_zone_in_update(zone)) {
> +                       ret = BLKPREP_DEFER;
> +                       goto out;
> +               }
> +
> +               if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
> +                       sd_zbc_debug(sdkp,
> +                                    "Opening unknown zone %zu\n",
> +                                    zone->start);
> +                       ret = BLKPREP_KILL;
> +                       goto out;
> +               }
> +
> +               /*
> +                * Nothing to do for conventional zones,
> +                * zones already open or full zones.
> +                */
> +               if (blk_zone_is_conv(zone) ||
> +                   blk_zone_is_open(zone) ||
> +                   blk_zone_is_full(zone)) {
> +                       ret = BLKPREP_DONE;
> +                       goto out;
> +               }
> +
> +               if (sector != zone->start ||
> +                   (nr_sects != zone->len)) {
> +                       sd_printk(KERN_ERR, sdkp,
> +                                 "Unaligned open zone request, start %zu/%zu"
> +                                 " len %zu/%zu\n",
> +                                 zone->start, sector, zone->len, nr_sects);
> +                       ret = BLKPREP_KILL;
> +                       goto out;
> +               }
> +
> +       }
> +
> +       sd_zbc_setup_action_cmnd(cmd, ZO_OPEN_ZONE, !zone);
> +
> +out:
> +       if (zone) {
> +               if (ret == BLKPREP_OK)
> +                       /*
> +                        * Opportunistic update. Will be fixed up
> +                        * with zone update if the command fails.
> +                        */
> +                       zone->cond = BLK_ZONE_COND_EXP_OPEN;
> +               blk_unlock_zone(zone);
> +       }
> +
> +       return ret;
> +}
> +
> +int sd_zbc_setup_close_cmnd(struct scsi_cmnd *cmd)
> +{
> +       struct request *rq = cmd->request;
> +       struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
> +       sector_t sector = blk_rq_pos(rq);
> +       sector_t nr_sects = blk_rq_sectors(rq);
> +       struct blk_zone *zone = NULL;
> +       int ret = BLKPREP_OK;
> +
> +       if (nr_sects) {
> +               zone = blk_lookup_zone(rq->q, sector);
> +               if (!zone)
> +                       return BLKPREP_KILL;
> +       }
> +
> +       if (zone) {
> +
> +               blk_lock_zone(zone);
> +
> +               /* If the zone is being updated, wait */
> +               if (blk_zone_in_update(zone)) {
> +                       ret = BLKPREP_DEFER;
> +                       goto out;
> +               }
> +
> +               if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
> +                       sd_zbc_debug(sdkp,
> +                                    "Closing unknown zone %zu\n",
> +                                    zone->start);
> +                       ret = BLKPREP_KILL;
> +                       goto out;
> +               }
> +
> +               /*
> +                * Nothing to do for conventional zones,
> +                * full zones or empty zones.
> +                */
> +               if (blk_zone_is_conv(zone) ||
> +                   blk_zone_is_full(zone) ||
> +                   blk_zone_is_empty(zone)) {
> +                       ret = BLKPREP_DONE;
> +                       goto out;
> +               }
> +
> +               if (sector != zone->start ||
> +                   (nr_sects != zone->len)) {
> +                       sd_printk(KERN_ERR, sdkp,
> +                                 "Unaligned close zone request, start %zu/%zu"
> +                                 " len %zu/%zu\n",
> +                                 zone->start, sector, zone->len, nr_sects);
> +                       ret = BLKPREP_KILL;
> +                       goto out;
> +               }
> +
> +       }
> +
> +       sd_zbc_setup_action_cmnd(cmd, ZO_CLOSE_ZONE, !zone);
> +
> +out:
> +       if (zone) {
> +               if (ret == BLKPREP_OK)
> +                       /*
> +                        * Opportunistic update. Will be fixed up
> +                        * with zone update if the command fails.
> +                        */
> +                       zone->cond = BLK_ZONE_COND_CLOSED;
> +               blk_unlock_zone(zone);
> +       }
> +
> +       return ret;
> +}
> +
> +int sd_zbc_setup_finish_cmnd(struct scsi_cmnd *cmd)
> +{
> +       struct request *rq = cmd->request;
> +       struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
> +       sector_t sector = blk_rq_pos(rq);
> +       sector_t nr_sects = blk_rq_sectors(rq);
> +       struct blk_zone *zone = NULL;
> +       int ret = BLKPREP_OK;
> +
> +       if (nr_sects) {
> +               zone = blk_lookup_zone(rq->q, sector);
> +               if (!zone)
> +                       return BLKPREP_KILL;
> +       }
> +
> +       if (zone) {
> +
> +               blk_lock_zone(zone);
> +
> +               /* If the zone is being updated, wait */
> +               if (blk_zone_in_update(zone)) {
> +                       ret = BLKPREP_DEFER;
> +                       goto out;
> +               }
> +
> +               if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
> +                       sd_zbc_debug(sdkp,
> +                                    "Finishing unknown zone %zu\n",
> +                                    zone->start);
> +                       ret = BLKPREP_KILL;
> +                       goto out;
> +               }
> +
> +               /* Nothing to do for conventional zones and full zones */
> +               if (blk_zone_is_conv(zone) ||
> +                   blk_zone_is_full(zone)) {
> +                       ret = BLKPREP_DONE;
> +                       goto out;
> +               }
> +
> +               if (sector != zone->start ||
> +                   (nr_sects != zone->len)) {
> +                       sd_printk(KERN_ERR, sdkp,
> +                                 "Unaligned finish zone request, start %zu/%zu"
> +                                 " len %zu/%zu\n",
> +                                 zone->start, sector, zone->len, nr_sects);
> +                       ret = BLKPREP_KILL;
> +                       goto out;
> +               }
> +
> +       }
> +
> +       sd_zbc_setup_action_cmnd(cmd, ZO_FINISH_ZONE, !zone);
> +
> +out:
> +       if (zone) {
> +               if (ret == BLKPREP_OK) {
> +                       /*
> +                        * Opportunistic update. Will be fixed up
> +                        * with zone update if the command fails.
> +                        */
> +                       zone->cond = BLK_ZONE_COND_FULL;
> +                       if (blk_zone_is_seq(zone))
> +                               zone->wp = zone->start + zone->len;
> +               }
> +               blk_unlock_zone(zone);
> +       }
> +
> +       return ret;
> +}
> +

Would be nice to have open/close/finish/reset share a little more code.
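Roughly what I have in mind, written as a standalone userspace model rather than kernel code. Every name below (zone_model, zone_action_prep, the PREP_* and COND_* values) is made up for illustration, and the zone locking, the write lock taken by reset, and the unknown-type check are left out to keep it short:

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel types; all names are hypothetical. */
enum prep { PREP_OK, PREP_KILL, PREP_DEFER, PREP_DONE };
enum zcond { COND_EMPTY, COND_EXP_OPEN, COND_CLOSED, COND_FULL };
enum zaction { ACT_RESET, ACT_OPEN, ACT_CLOSE, ACT_FINISH };

struct zone_model {
	uint64_t start, len, wp;
	bool conv;		/* conventional zone */
	bool in_update;		/* zone information is being refreshed */
	enum zcond cond;
};

/* True if the action would be a no-op for the zone's current condition. */
static bool zone_action_is_noop(const struct zone_model *z, enum zaction act)
{
	if (z->conv)
		return true;
	switch (act) {
	case ACT_RESET:
		return z->cond == COND_EMPTY;
	case ACT_OPEN:
		return z->cond == COND_EXP_OPEN || z->cond == COND_FULL;
	case ACT_CLOSE:
		return z->cond == COND_FULL || z->cond == COND_EMPTY;
	case ACT_FINISH:
		return z->cond == COND_FULL;
	}
	return false;
}

/* Checks shared by all four actions, then the opportunistic cache update. */
static enum prep zone_action_prep(struct zone_model *z, uint64_t sector,
				  uint64_t nr_sects, enum zaction act)
{
	if (z->in_update)
		return PREP_DEFER;
	if (zone_action_is_noop(z, act))
		return PREP_DONE;
	if (sector != z->start || nr_sects != z->len)
		return PREP_KILL;	/* request must cover the whole zone */

	switch (act) {
	case ACT_RESET:
		z->wp = z->start;
		z->cond = COND_EMPTY;
		break;
	case ACT_OPEN:
		z->cond = COND_EXP_OPEN;
		break;
	case ACT_CLOSE:
		z->cond = COND_CLOSED;
		break;
	case ACT_FINISH:
		z->wp = z->start + z->len;
		z->cond = COND_FULL;
		break;
	}
	return PREP_OK;
}
```

With something like this, each of the four setup functions would reduce to a zone lookup, one call to the common helper, and the sd_zbc_setup_action_cmnd() call with its own opcode.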

> +int sd_zbc_setup_read_write(struct scsi_disk *sdkp, struct request *rq,
> +                           sector_t sector, unsigned int *num_sectors)
> +{
> +       struct blk_zone *zone;
> +       unsigned int sectors = *num_sectors;
> +       int ret = BLKPREP_OK;
> +
> +       zone = blk_lookup_zone(rq->q, sector);
> +       if (!zone)
> +               /* Let the drive handle the request */
> +               return BLKPREP_OK;
> +
> +       blk_lock_zone(zone);
> +
> +       /* If the zone is being updated, wait */
> +       if (blk_zone_in_update(zone)) {
> +               ret = BLKPREP_DEFER;
> +               goto out;
> +       }
> +
> +       if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
> +               sd_zbc_debug(sdkp,
> +                            "Unknown zone %zu\n",
> +                            zone->start);
> +               ret = BLKPREP_KILL;
> +               goto out;
> +       }
> +
> +       /* For offline and read-only zones, let the drive fail the command */
> +       if (blk_zone_is_offline(zone) ||
> +           blk_zone_is_readonly(zone))
> +               goto out;
> +
> +       /* Do not allow zone boundaries crossing */
> +       if (sector + sectors > zone->start + zone->len) {
> +               ret = BLKPREP_KILL;
> +               goto out;
> +       }
> +
> +       /* For conventional zones, no checks */
> +       if (blk_zone_is_conv(zone))
> +               goto out;
> +
> +       if (req_op(rq) == REQ_OP_WRITE ||
> +           req_op(rq) == REQ_OP_WRITE_SAME) {
> +
> +               /*
> +                * Write requests may change the write pointer and
> +                * transition the zone condition to full. Changes
> +                * are opportunistic here. If the request fails, a
> +                * zone update will fix the zone information.
> +                */
> +               if (blk_zone_is_seq_req(zone)) {
> +
> +                       /*
> +                        * Do not issue more than one write at a time per
> +                        * zone. This solves write ordering problems due to
> +                        * the unlocking of the request queue in the dispatch
> +                        * path in the non scsi-mq case. For scsi-mq, this
> +                        * also avoids potential write reordering when multiple
> +                        * threads running on different CPUs write to the same
> +                        * zone (with a synchronized sequential pattern).
> +                        */
> +                       if (!blk_try_write_lock_zone(zone)) {
> +                               ret = BLKPREP_DEFER;
> +                               goto out;
> +                       }
> +
> +                       /* For host-managed drives, writes are allowed */
> +                       /* only at the write pointer position.         */
> +                       if (zone->wp != sector) {
> +                               blk_write_unlock_zone(zone);
> +                               ret = BLKPREP_KILL;
> +                               goto out;
> +                       }
> +
> +                       zone->wp += sectors;
> +                       if (zone->wp >= zone->start + zone->len) {
> +                               zone->cond = BLK_ZONE_COND_FULL;
> +                               zone->wp = zone->start + zone->len;
> +                       }
> +
> +               } else {
> +
> +                       /* For host-aware drives, writes are allowed */
> +                       /* anywhere in the zone, but wp can only go  */
> +                       /* forward.                                  */
> +                       sector_t end_sector = sector + sectors;
> +                       if (sector == zone->wp &&
> +                           end_sector >= zone->start + zone->len) {
> +                               zone->cond =3D BLK_ZONE_COND_FULL;
> +                               zone->wp =3D zone->start + zone->len;
> +                       } else if (end_sector > zone->wp) {
> +                               zone->wp =3D end_sector;
> +                       }
> +
> +               }
> +
> +       } else {
> +

If the drive does not have restricted reads,
then just goto out here.

Not all HM drives will have restricted reads, and
no HA drives have restricted reads.
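
As a standalone model of the check order I mean (not kernel code): the restricted_reads flag and the function name are hypothetical. I am assuming the flag would be derived from the URSWRZ bit of the zoned block device characteristics VPD page, and would always be false for host-aware drives:

```c
#include <stdbool.h>
#include <stdint.h>

enum read_prep { READ_PASS, READ_ZEROFILL, READ_TRIMMED };

/*
 * Decide how to handle a read in a sequential zone.
 * restricted_reads is a hypothetical per-disk flag: when it is false,
 * the request goes to the drive untouched, whatever the write pointer is.
 */
static enum read_prep zone_read_prep(bool restricted_reads, uint64_t wp,
				     uint64_t sector, uint32_t *nr_sects)
{
	if (!restricted_reads)
		return READ_PASS;	/* let the drive handle it */
	if (sector + *nr_sects <= wp)
		return READ_PASS;	/* entirely below the write pointer */
	if (wp <= sector)
		return READ_ZEROFILL;	/* entirely at or beyond the write pointer */
	*nr_sects = (uint32_t)(wp - sector);	/* straddles the write pointer */
	return READ_TRIMMED;
}
```

The zero-fill and trimming paths in the hunk below would then only run for drives that actually reject reads past the write pointer.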

> +               /* Check read after write pointer */
> +               if (sector + sectors <= zone->wp)
> +                       goto out;
> +
> +               if (zone->wp <= sector) {
> +                       /* Read beyond WP: clear request buffer */
> +                       struct req_iterator iter;
> +                       struct bio_vec bvec;
> +                       unsigned long flags;
> +                       void *buf;
> +                       rq_for_each_segment(bvec, rq, iter) {
> +                               buf = bvec_kmap_irq(&bvec, &flags);
> +                               memset(buf, 0, bvec.bv_len);
> +                               flush_dcache_page(bvec.bv_page);
> +                               bvec_kunmap_irq(buf, &flags);
> +                       }
> +                       ret = BLKPREP_DONE;
> +                       goto out;
> +               }
> +
> +               /* Read straddle WP position: limit request size */
> +               *num_sectors = zone->wp - sector;
> +
> +       }
> +
> +out:
> +       blk_unlock_zone(zone);
> +
> +       return ret;
> +}
> +
> +void sd_zbc_done(struct scsi_cmnd *cmd,
> +                struct scsi_sense_hdr *sshdr)
> +{
> +       int result = cmd->result;
> +       struct request *rq = cmd->request;
> +       struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
> +       struct request_queue *q = sdkp->disk->queue;
> +       sector_t pos = blk_rq_pos(rq);
> +       struct blk_zone *zone = NULL;
> +       bool write_unlock = false;
> +
> +       /*
> +        * Get the target zone of commands of interest. Some may
> +        * apply to all zones so check the request sectors first.
> +        */
> +       switch (req_op(rq)) {
> +       case REQ_OP_DISCARD:
> +       case REQ_OP_WRITE:
> +       case REQ_OP_WRITE_SAME:
> +       case REQ_OP_ZONE_RESET:
> +               write_unlock = true;
> +               /* fallthru */
> +       case REQ_OP_ZONE_OPEN:
> +       case REQ_OP_ZONE_CLOSE:
> +       case REQ_OP_ZONE_FINISH:
> +               if (blk_rq_sectors(rq))
> +                       zone = blk_lookup_zone(q, pos);
> +               break;
> +       }
> +
> +       if (zone && write_unlock)
> +           blk_write_unlock_zone(zone);
> +
> +       if (!result)
> +               return;
> +
> +       if (sshdr->sense_key == ILLEGAL_REQUEST &&
> +           sshdr->asc == 0x21)
> +               /*
> +                * It is unlikely that retrying requests failed with any
> +                * kind of alignment error will result in success. So don't
> +                * try. Report the error back to the user quickly so that
> +                * corrective actions can be taken after obtaining updated
> +                * zone information.
> +                */
> +               cmd->allowed = 0;
> +
> +       /* On error, force an update unless this is a failed report */
> +       if (req_op(rq) == REQ_OP_ZONE_REPORT)
> +               sd_zbc_clear_zones_updating(sdkp, pos, blk_rq_sectors(rq));
> +       else if (zone)
> +               sd_zbc_update_zones(sdkp, zone->start, zone->len,
> +                                   GFP_ATOMIC, false);
> +}
> +
> +void sd_zbc_read_zones(struct scsi_disk *sdkp, char *buf)
> +{
> +       struct request_queue *q = sdkp->disk->queue;
> +       struct blk_zone *zone;
> +       sector_t capacity;
> +       sector_t sector;
> +       bool init = false;
> +       u32 rep_len;
> +       int ret = 0;
> +
> +       if (sdkp->zoned != 1 && sdkp->device->type != TYPE_ZBC)
> +               /*
> +                * Device managed or normal SCSI disk,
> +                * no special handling required
> +                */
> +               return;
> +
> +       /* Do a report zone to get the maximum LBA to check capacity */
> +       ret = sd_zbc_report_zones(sdkp, buf, SD_BUF_SIZE,
> +                                 0, ZBC_ZONE_REPORTING_OPTION_ALL, false);
> +       if (ret < 0)
> +               return;
> +
> +       rep_len = get_unaligned_be32(&buf[0]);
> +       if (rep_len < 64) {
> +               sd_printk(KERN_WARNING, sdkp,
> +                         "REPORT ZONES report invalid length %u\n",
> +                         rep_len);
> +               return;
> +       }
> +
> +       if (sdkp->rc_basis == 0) {
> +               /* The max_lba field is the capacity of this device */
> +               sector_t lba = get_unaligned_be64(&buf[8]);
> +               if (lba + 1 > sdkp->capacity) {
> +                       if (sdkp->first_scan)
> +                               sd_printk(KERN_WARNING, sdkp,
> +                                         "Changing capacity from %zu "
> +                                         "to max LBA+1 %zu\n",
> +                                         sdkp->capacity,
> +                                         (sector_t) lba + 1);
> +                       sdkp->capacity = lba + 1;
> +               }
> +       }
> +
> +       /* Setup the zone work queue */
> +       if (! sdkp->zone_work_q) {
> +               sdkp->zone_work_q =
> +                       alloc_ordered_workqueue("zbc_wq_%s", WQ_MEM_RECLAIM,
> +                                               sdkp->disk->disk_name);
> +               if (!sdkp->zone_work_q) {
> +                       sdev_printk(KERN_WARNING, sdkp->device,
> +                                   "Create zoned disk workqueue failed\n");
> +                       return;
> +               }
> +               init = true;
> +       }
> +
> +       /*
> +        * Parse what we already got. If all zones are not parsed yet,
> +        * kick start an update to get the remaining.
> +        */
> +       capacity = logical_to_sectors(sdkp->device, sdkp->capacity);
> +       ret = zbc_parse_zones(sdkp, buf, SD_BUF_SIZE, &sector);
> +       if (ret == 0 && sector < capacity) {
> +               sd_zbc_update_zones(sdkp, sector, capacity - sector,
> +                                   GFP_KERNEL, init);
> +               drain_workqueue(sdkp->zone_work_q);
> +       }
> +       if (ret)
> +               return;
> +
> +       /*
> +        * Analyze the zones layout: if all zones are the same size and
> +        * the size is a power of 2, chunk the device and map discard to
> +        * reset write pointer command. Otherwise, disable discard.
> +        */
> +       sdkp->zone_sectors = 0;
> +       sdkp->nr_zones = 0;
> +       sector = 0;
> +       while(sector < capacity) {
> +
> +               zone = blk_lookup_zone(q, sector);
> +               if (!zone) {
> +                       sdkp->zone_sectors = 0;
> +                       sdkp->nr_zones = 0;
> +                       break;
> +               }
> +
> +               sector += zone->len;
> +
> +               if (sdkp->zone_sectors == 0) {
> +                       sdkp->zone_sectors = zone->len;
> +               } else if (sector != capacity &&
> +                        zone->len != sdkp->zone_sectors) {
> +                       sdkp->zone_sectors = 0;
> +                       sdkp->nr_zones = 0;
> +                       break;
> +               }
> +
> +               sdkp->nr_zones++;
> +
> +       }
> +
> +       if (!sdkp->zone_sectors ||
> +           !is_power_of_2(sdkp->zone_sectors)) {
> +               sd_config_discard(sdkp, SD_LBP_DISABLE);
> +               if (sdkp->first_scan)
> +                       sd_printk(KERN_NOTICE, sdkp,
> +                                 "%u zones (non constant zone size)\n",
> +                                 sdkp->nr_zones);
> +               return;
> +       }
> +
> +       /* Setup discard granularity to the zone size */
> +       blk_queue_chunk_sectors(sdkp->disk->queue, sdkp->zone_sectors);
> +       sdkp->max_unmap_blocks = sdkp->zone_sectors;
> +       sdkp->unmap_alignment = sectors_to_logical(sdkp->device,
> +                                                  sdkp->zone_sectors);
> +       sdkp->unmap_granularity = sdkp->unmap_alignment;
> +       sd_config_discard(sdkp, SD_ZBC_RESET_WP);
> +
> +       if (sdkp->first_scan) {
> +               if (sdkp->nr_zones * sdkp->zone_sectors == capacity)
> +                       sd_printk(KERN_NOTICE, sdkp,
> +                                 "%u zones of %zu sectors\n",
> +                                 sdkp->nr_zones,
> +                                 sdkp->zone_sectors);
> +               else
> +                       sd_printk(KERN_NOTICE, sdkp,
> +                                 "%u zones of %zu sectors "
> +                                 "+ 1 runt zone\n",
> +                                 sdkp->nr_zones - 1,
> +                                 sdkp->zone_sectors);
> +       }
> +}
> +
> +void sd_zbc_remove(struct scsi_disk *sdkp)
> +{
> +
> +       sd_config_discard(sdkp, SD_LBP_DISABLE);
> +
> +       if (sdkp->zone_work_q) {
> +               drain_workqueue(sdkp->zone_work_q);
> +               destroy_workqueue(sdkp->zone_work_q);
> +               sdkp->zone_work_q = NULL;
> +               blk_drop_zones(sdkp->disk->queue);
> +       }
> +}
> +
> diff --git a/include/scsi/scsi_proto.h b/include/scsi/scsi_proto.h
> index d1defd1..6ba66e0 100644
> --- a/include/scsi/scsi_proto.h
> +++ b/include/scsi/scsi_proto.h
> @@ -299,4 +299,21 @@ struct scsi_lun {
>  #define SCSI_ACCESS_STATE_MASK        0x0f
>  #define SCSI_ACCESS_STATE_PREFERRED   0x80
>
> +/* Reporting options for REPORT ZONES */
> +enum zbc_zone_reporting_options {
> +       ZBC_ZONE_REPORTING_OPTION_ALL = 0,
> +       ZBC_ZONE_REPORTING_OPTION_EMPTY,
> +       ZBC_ZONE_REPORTING_OPTION_IMPLICIT_OPEN,
> +       ZBC_ZONE_REPORTING_OPTION_EXPLICIT_OPEN,
> +       ZBC_ZONE_REPORTING_OPTION_CLOSED,
> +       ZBC_ZONE_REPORTING_OPTION_FULL,
> +       ZBC_ZONE_REPORTING_OPTION_READONLY,
> +       ZBC_ZONE_REPORTING_OPTION_OFFLINE,
> +       ZBC_ZONE_REPORTING_OPTION_NEED_RESET_WP = 0x10,
> +       ZBC_ZONE_REPORTING_OPTION_NON_SEQWRITE,
> +       ZBC_ZONE_REPORTING_OPTION_NON_WP = 0x3f,
> +};
> +
> +#define ZBC_REPORT_ZONE_PARTIAL 0x80
> +

Why don't we expose these enums via uapi?
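
For instance, a uapi header could carry the same values so that tools sending REPORT ZONES through SG_IO do not have to duplicate them. A sketch only: the values are copied from the enum above, while the header location and the zbc_report_option_byte() helper are hypothetical, and the helper assumes the reporting option occupies the low 6 bits of the option byte with PARTIAL in bit 7:

```c
#include <stdbool.h>
#include <stdint.h>

/* Reporting options for REPORT ZONES, values as in the patch above. */
enum zbc_zone_reporting_options {
	ZBC_ZONE_REPORTING_OPTION_ALL = 0,
	ZBC_ZONE_REPORTING_OPTION_EMPTY,
	ZBC_ZONE_REPORTING_OPTION_IMPLICIT_OPEN,
	ZBC_ZONE_REPORTING_OPTION_EXPLICIT_OPEN,
	ZBC_ZONE_REPORTING_OPTION_CLOSED,
	ZBC_ZONE_REPORTING_OPTION_FULL,
	ZBC_ZONE_REPORTING_OPTION_READONLY,
	ZBC_ZONE_REPORTING_OPTION_OFFLINE,
	ZBC_ZONE_REPORTING_OPTION_NEED_RESET_WP = 0x10,
	ZBC_ZONE_REPORTING_OPTION_NON_SEQWRITE,
	ZBC_ZONE_REPORTING_OPTION_NON_WP = 0x3f,
};

#define ZBC_REPORT_ZONE_PARTIAL 0x80

/*
 * Hypothetical helper (not part of the patch): combine a reporting
 * option with the PARTIAL flag into a single option byte.
 */
static inline uint8_t
zbc_report_option_byte(enum zbc_zone_reporting_options opt, bool partial)
{
	return (uint8_t)((opt & 0x3f) |
			 (partial ? ZBC_REPORT_ZONE_PARTIAL : 0));
}
```

User space could then build the option byte from named constants instead of hard-coded numbers.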


>  #endif /* _SCSI_PROTO_H_ */
> --
> 2.7.4



-- 
Shaun Tancheff


* Re: [PATCH 8/9] sd: Implement support for ZBC devices
@ 2016-09-20  5:40     ` Shaun Tancheff
  0 siblings, 0 replies; 36+ messages in thread
From: Shaun Tancheff @ 2016-09-20  5:40 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: linux-scsi, linux-block, Martin K. Petersen, Jens Axboe,
	Hannes Reinecke, Hannes Reinecke

On Mon, Sep 19, 2016 at 4:27 PM, Damien Le Moal <damien.lemoal@hgst.com> wrote:
> From: Hannes Reinecke <hare@suse.com>
>
> Implement ZBC support functions to setup zoned disks and fill the
> block device zone information tree during the device scan. The
> zone information tree is also always updated on disk revalidation.
> This adds support for the REQ_OP_ZONE* operations and also implements
> the new RESET_WP provisioning mode so that discard requests can be
> mapped to the RESET WRITE POINTER command for devices with a constant
> zone size.
>
> The capacity read of the device triggers the zone information read
> for zoned block devices. As this needs the device zone model, the
> call to sd_read_capacity is moved after the call to
> sd_read_block_characteristics so that host-aware devices are
> properly initialized. The call to sd_zbc_read_zones in
> sd_read_capacity may change the device capacity obtained with
> the sd_read_capacity_16 function for devices reporting only the
> capacity of conventional zones at the beginning of the LBA range
> (i.e. devices with rc_basis set to 0).
>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
> ---
>  drivers/scsi/Makefile     |    1 +
>  drivers/scsi/sd.c         |  147 ++++--
>  drivers/scsi/sd.h         |   68 +++
>  drivers/scsi/sd_zbc.c     | 1097 +++++++++++++++++++++++++++++++++++++++++++++
>  include/scsi/scsi_proto.h |   17 +
>  5 files changed, 1304 insertions(+), 26 deletions(-)
>  create mode 100644 drivers/scsi/sd_zbc.c
>
> diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile
> index d539798..fabcb6d 100644
> --- a/drivers/scsi/Makefile
> +++ b/drivers/scsi/Makefile
> @@ -179,6 +179,7 @@ hv_storvsc-y                        := storvsc_drv.o
>
>  sd_mod-objs    := sd.o
>  sd_mod-$(CONFIG_BLK_DEV_INTEGRITY) += sd_dif.o
> +sd_mod-$(CONFIG_BLK_DEV_ZONED) += sd_zbc.o
>
>  sr_mod-objs    := sr.o sr_ioctl.o sr_vendor.o
>  ncr53c8xx-flags-$(CONFIG_SCSI_ZALON) \
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index d3e852a..46b8b78 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -92,6 +92,7 @@ MODULE_ALIAS_BLOCKDEV_MAJOR(SCSI_DISK15_MAJOR);
>  MODULE_ALIAS_SCSI_DEVICE(TYPE_DISK);
>  MODULE_ALIAS_SCSI_DEVICE(TYPE_MOD);
>  MODULE_ALIAS_SCSI_DEVICE(TYPE_RBC);
> +MODULE_ALIAS_SCSI_DEVICE(TYPE_ZBC);
>
>  #if !defined(CONFIG_DEBUG_BLOCK_EXT_DEVT)
>  #define SD_MINORS      16
> @@ -99,7 +100,6 @@ MODULE_ALIAS_SCSI_DEVICE(TYPE_RBC);
>  #define SD_MINORS      0
>  #endif
>
> -static void sd_config_discard(struct scsi_disk *, unsigned int);
>  static void sd_config_write_same(struct scsi_disk *);
>  static int  sd_revalidate_disk(struct gendisk *);
>  static void sd_unlock_native_capacity(struct gendisk *disk);
> @@ -162,7 +162,7 @@ cache_type_store(struct device *dev, struct device_attribute *attr,
>         static const char temp[] = "temporary ";
>         int len;
>
> -       if (sdp->type != TYPE_DISK)
> +       if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
>                 /* no cache control on RBC devices; theoretically they
>                  * can do it, but there's probably so many exceptions
>                  * it's not worth the risk */
> @@ -261,7 +261,7 @@ allow_restart_store(struct device *dev, struct device_attribute *attr,
>         if (!capable(CAP_SYS_ADMIN))
>                 return -EACCES;
>
> -       if (sdp->type != TYPE_DISK)
> +       if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
>                 return -EINVAL;
>
>         sdp->allow_restart = simple_strtoul(buf, NULL, 10);
> @@ -369,6 +369,7 @@ static const char *lbp_mode[] = {
>         [SD_LBP_WS16]           = "writesame_16",
>         [SD_LBP_WS10]           = "writesame_10",
>         [SD_LBP_ZERO]           = "writesame_zero",
> +       [SD_ZBC_RESET_WP]       = "reset_wp",
>         [SD_LBP_DISABLE]        = "disabled",
>  };
>
> @@ -391,6 +392,13 @@ provisioning_mode_store(struct device *dev, struct device_attribute *attr,
>         if (!capable(CAP_SYS_ADMIN))
>                 return -EACCES;
>
> +       if (sdkp->zoned == 1 || sdp->type == TYPE_ZBC) {
> +               if (!strncmp(buf, lbp_mode[SD_ZBC_RESET_WP], 20)) {
> +                       sd_config_discard(sdkp, SD_ZBC_RESET_WP);
> +                       return count;
> +               }
> +               return -EINVAL;
> +       }
>         if (sdp->type != TYPE_DISK)
>                 return -EINVAL;
>
> @@ -458,7 +466,7 @@ max_write_same_blocks_store(struct device *dev, struct device_attribute *attr,
>         if (!capable(CAP_SYS_ADMIN))
>                 return -EACCES;
>
> -       if (sdp->type != TYPE_DISK)
> +       if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
>                 return -EINVAL;
>
>         err = kstrtoul(buf, 10, &max);
> @@ -631,7 +639,7 @@ static unsigned char sd_setup_protect_cmnd(struct scsi_cmnd *scmd,
>         return protect;
>  }
>
> -static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode)
> +void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode)
>  {
>         struct request_queue *q = sdkp->disk->queue;
>         unsigned int logical_block_size = sdkp->device->sector_size;
> @@ -683,6 +691,11 @@ static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode)
>                 q->limits.discard_zeroes_data = sdkp->lbprz;
>                 break;
>
> +       case SD_ZBC_RESET_WP:
> +               max_blocks = min_not_zero(sdkp->max_unmap_blocks,
> +                                         (u32)SD_MAX_WS16_BLOCKS);
> +               break;
> +
>         case SD_LBP_ZERO:
>                 max_blocks = min_not_zero(sdkp->max_ws_blocks,
>                                           (u32)SD_MAX_WS10_BLOCKS);
> @@ -711,16 +724,20 @@ static int sd_setup_discard_cmnd(struct scsi_cmnd *cmd)
>         unsigned int nr_sectors = blk_rq_sectors(rq);
>         unsigned int nr_bytes = blk_rq_bytes(rq);
>         unsigned int len;
> -       int ret;
> +       int ret = BLKPREP_OK;
>         char *buf;
> -       struct page *page;
> +       struct page *page = NULL;
>
>         sector >>= ilog2(sdp->sector_size) - 9;
>         nr_sectors >>= ilog2(sdp->sector_size) - 9;
>
> -       page = alloc_page(GFP_ATOMIC | __GFP_ZERO);
> -       if (!page)
> -               return BLKPREP_DEFER;
> +       if (sdkp->provisioning_mode != SD_ZBC_RESET_WP) {
> +               page = alloc_page(GFP_ATOMIC | __GFP_ZERO);
> +               if (!page)
> +                       return BLKPREP_DEFER;
> +       }
> +
> +       rq->completion_data = page;
>
>         switch (sdkp->provisioning_mode) {
>         case SD_LBP_UNMAP:
> @@ -760,12 +777,19 @@ static int sd_setup_discard_cmnd(struct scsi_cmnd *cmd)
>                 len = sdkp->device->sector_size;
>                 break;
>
> +       case SD_ZBC_RESET_WP:
> +               ret = sd_zbc_setup_reset_cmnd(cmd);
> +               if (ret != BLKPREP_OK)
> +                       goto out;
> +               /* Reset Write Pointer doesn't have a payload */
> +               len = 0;
> +               break;
> +
>         default:
>                 ret = BLKPREP_INVALID;
>                 goto out;
>         }
>
> -       rq->completion_data = page;
>         rq->timeout = SD_TIMEOUT;
>
>         cmd->transfersize = len;
> @@ -779,13 +803,17 @@ static int sd_setup_discard_cmnd(struct scsi_cmnd *cmd)
>          * discarded on disk. This allows us to report completion on the full
>          * amount of blocks described by the request.
>          */
> -       blk_add_request_payload(rq, page, 0, len);
> -       ret = scsi_init_io(cmd);
> +       if (len) {
> +               blk_add_request_payload(rq, page, 0, len);
> +               ret = scsi_init_io(cmd);
> +       }
>         rq->__data_len = nr_bytes;
>
>  out:
> -       if (ret != BLKPREP_OK)
> +       if (page && ret != BLKPREP_OK) {
> +               rq->completion_data = NULL;
>                 __free_page(page);
> +       }
>         return ret;
>  }
>
> @@ -843,6 +871,13 @@ static int sd_setup_write_same_cmnd(struct scsi_cmnd *cmd)
>
>         BUG_ON(bio_offset(bio) || bio_iovec(bio).bv_len != sdp->sector_size);
>
> +       if (sdkp->zoned == 1 || sdp->type == TYPE_ZBC) {
> +               /* sd_zbc_setup_read_write uses block layer sector units */
> +               ret = sd_zbc_setup_read_write(sdkp, rq, sector, &nr_sectors);
> +               if (ret != BLKPREP_OK)
> +                       return ret;
> +       }
> +
>         sector >>= ilog2(sdp->sector_size) - 9;
>         nr_sectors >>= ilog2(sdp->sector_size) - 9;
>
> @@ -962,6 +997,13 @@ static int sd_setup_read_write_cmnd(struct scsi_cmnd *SCpnt)
>         SCSI_LOG_HLQUEUE(2, scmd_printk(KERN_INFO, SCpnt, "block=%llu\n",
>                                         (unsigned long long)block));
>
> +       if (sdkp->zoned == 1 || sdp->type == TYPE_ZBC) {
> +               /* sd_zbc_setup_read_write uses block layer sector units */
> +               ret = sd_zbc_setup_read_write(sdkp, rq, block, &this_count);
> +               if (ret != BLKPREP_OK)
> +                       goto out;
> +       }
> +
>         /*
>          * If we have a 1K hardware sectorsize, prevent access to single
>          * 512 byte sectors.  In theory we could handle this - in fact
> @@ -1148,6 +1190,16 @@ static int sd_init_command(struct scsi_cmnd *cmd)
>         case REQ_OP_READ:
>         case REQ_OP_WRITE:
>                 return sd_setup_read_write_cmnd(cmd);
> +       case REQ_OP_ZONE_REPORT:
> +               return sd_zbc_setup_report_cmnd(cmd);
> +       case REQ_OP_ZONE_RESET:
> +               return sd_zbc_setup_reset_cmnd(cmd);
> +       case REQ_OP_ZONE_OPEN:
> +               return sd_zbc_setup_open_cmnd(cmd);
> +       case REQ_OP_ZONE_CLOSE:
> +               return sd_zbc_setup_close_cmnd(cmd);
> +       case REQ_OP_ZONE_FINISH:
> +               return sd_zbc_setup_finish_cmnd(cmd);
>         default:
>                 BUG();
>         }
> @@ -1157,7 +1209,8 @@ static void sd_uninit_command(struct scsi_cmnd *SCpnt)
>  {
>         struct request *rq = SCpnt->request;
>
> -       if (req_op(rq) == REQ_OP_DISCARD)
> +       if (req_op(rq) == REQ_OP_DISCARD &&
> +           rq->completion_data)
>                 __free_page(rq->completion_data);
>
>         if (SCpnt->cmnd != rq->cmd) {
> @@ -1778,8 +1831,16 @@ static int sd_done(struct scsi_cmnd *SCpnt)
>         int sense_deferred = 0;
>         unsigned char op = SCpnt->cmnd[0];
>         unsigned char unmap = SCpnt->cmnd[1] & 8;
> +       unsigned char sa = SCpnt->cmnd[1] & 0xf;
>
> -       if (req_op(req) == REQ_OP_DISCARD || req_op(req) == REQ_OP_WRITE_SAME) {
> +       switch(req_op(req)) {
> +       case REQ_OP_DISCARD:
> +       case REQ_OP_WRITE_SAME:
> +       case REQ_OP_ZONE_REPORT:
> +       case REQ_OP_ZONE_RESET:
> +       case REQ_OP_ZONE_OPEN:
> +       case REQ_OP_ZONE_CLOSE:
> +       case REQ_OP_ZONE_FINISH:
>                 if (!result) {
>                         good_bytes = blk_rq_bytes(req);
>                         scsi_set_resid(SCpnt, 0);
> @@ -1787,6 +1848,7 @@ static int sd_done(struct scsi_cmnd *SCpnt)
>                         good_bytes = 0;
>                         scsi_set_resid(SCpnt, blk_rq_bytes(req));
>                 }
> +               break;
>         }
>
>         if (result) {
> @@ -1829,6 +1891,10 @@ static int sd_done(struct scsi_cmnd *SCpnt)
>                         case UNMAP:
>                                 sd_config_discard(sdkp, SD_LBP_DISABLE);
>                                 break;
> +                       case ZBC_OUT:
> +                               if (sa == ZO_RESET_WRITE_POINTER)
> +                                       sd_config_discard(sdkp, SD_LBP_DISABLE);
> +                               break;
>                         case WRITE_SAME_16:
>                         case WRITE_SAME:
>                                 if (unmap)
> @@ -1847,7 +1913,11 @@ static int sd_done(struct scsi_cmnd *SCpnt)
>         default:
>                 break;
>         }
> +
>   out:
> +       if (sdkp->zoned == 1 || sdkp->device->type == TYPE_ZBC)
> +               sd_zbc_done(SCpnt, &sshdr);
> +
>         SCSI_LOG_HLCOMPLETE(1, scmd_printk(KERN_INFO, SCpnt,
>                                            "sd_done: completed %d of %d bytes\n",
>                                            good_bytes, scsi_bufflen(SCpnt)));
> @@ -1982,7 +2052,6 @@ sd_spinup_disk(struct scsi_disk *sdkp)
>         }
>  }
>
> -
>  /*
>   * Determine whether disk supports Data Integrity Field.
>   */
> @@ -2132,6 +2201,9 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
>         /* Logical blocks per physical block exponent */
>         sdkp->physical_block_size = (1 << (buffer[13] & 0xf)) * sector_size;
>
> +       /* RC basis */
> +       sdkp->rc_basis = (buffer[12] >> 4) & 0x3;
> +
>         /* Lowest aligned logical block */
>         alignment = ((buffer[14] & 0x3f) << 8 | buffer[15]) * sector_size;
>         blk_queue_alignment_offset(sdp->request_queue, alignment);
> @@ -2322,6 +2394,11 @@ got_data:
>                 sector_size = 512;
>         }
>         blk_queue_logical_block_size(sdp->request_queue, sector_size);
> +       blk_queue_physical_block_size(sdp->request_queue,
> +                                     sdkp->physical_block_size);
> +       sdkp->device->sector_size = sector_size;
> +
> +       sd_zbc_read_zones(sdkp, buffer);
>
>         {
>                 char cap_str_2[10], cap_str_10[10];
> @@ -2348,9 +2425,6 @@ got_data:
>         if (sdkp->capacity > 0xffffffff)
>                 sdp->use_16_for_rw = 1;
>
> -       blk_queue_physical_block_size(sdp->request_queue,
> -                                     sdkp->physical_block_size);
> -       sdkp->device->sector_size = sector_size;
>  }
>
>  /* called with buffer of length 512 */
> @@ -2612,7 +2686,7 @@ static void sd_read_app_tag_own(struct scsi_disk *sdkp, unsigned char *buffer)
>         struct scsi_mode_data data;
>         struct scsi_sense_hdr sshdr;
>
> -       if (sdp->type != TYPE_DISK)
> +       if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
>                 return;
>
>         if (sdkp->protection_type == 0)
> @@ -2719,6 +2793,7 @@ static void sd_read_block_limits(struct scsi_disk *sdkp)
>   */
>  static void sd_read_block_characteristics(struct scsi_disk *sdkp)
>  {
> +       struct request_queue *q = sdkp->disk->queue;
>         unsigned char *buffer;
>         u16 rot;
>         const int vpd_len = 64;
> @@ -2733,10 +2808,21 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
>         rot = get_unaligned_be16(&buffer[4]);
>
>         if (rot == 1) {
> -               queue_flag_set_unlocked(QUEUE_FLAG_NONROT, sdkp->disk->queue);
> -               queue_flag_clear_unlocked(QUEUE_FLAG_ADD_RANDOM, sdkp->disk->queue);
> +               queue_flag_set_unlocked(QUEUE_FLAG_NONROT, q);
> +               queue_flag_clear_unlocked(QUEUE_FLAG_ADD_RANDOM, q);
>         }
>
> +       sdkp->zoned = (buffer[8] >> 4) & 3;
> +       if (sdkp->zoned == 1)
> +               q->limits.zoned = BLK_ZONED_HA;
> +       else if (sdkp->device->type == TYPE_ZBC)
> +               q->limits.zoned = BLK_ZONED_HM;
> +       else
> +               q->limits.zoned = BLK_ZONED_NONE;
> +       if (blk_queue_zoned(q) && sdkp->first_scan)
> +               sd_printk(KERN_NOTICE, sdkp, "Host-%s zoned block device\n",
> +                         q->limits.zoned == BLK_ZONED_HM ? "managed" : "aware");
> +
>   out:
>         kfree(buffer);
>  }
> @@ -2835,14 +2921,14 @@ static int sd_revalidate_disk(struct gendisk *disk)
>          * react badly if we do.
>          */
>         if (sdkp->media_present) {
> -               sd_read_capacity(sdkp, buffer);
> -
>                 if (scsi_device_supports_vpd(sdp)) {
>                         sd_read_block_provisioning(sdkp);
>                         sd_read_block_limits(sdkp);
>                         sd_read_block_characteristics(sdkp);
>                 }
>
> +               sd_read_capacity(sdkp, buffer);
> +
>                 sd_read_write_protect_flag(sdkp, buffer);
>                 sd_read_cache_type(sdkp, buffer);
>                 sd_read_app_tag_own(sdkp, buffer);
> @@ -3040,9 +3126,16 @@ static int sd_probe(struct device *dev)
>
>         scsi_autopm_get_device(sdp);
>         error = -ENODEV;
> -       if (sdp->type != TYPE_DISK && sdp->type != TYPE_MOD && sdp->type != TYPE_RBC)
> +       if (sdp->type != TYPE_DISK &&
> +           sdp->type != TYPE_ZBC &&
> +           sdp->type != TYPE_MOD &&
> +           sdp->type != TYPE_RBC)
>                 goto out;
>
> +#ifndef CONFIG_BLK_DEV_ZONED
> +       if (sdp->type == TYPE_ZBC)
> +               goto out;
> +#endif
>         SCSI_LOG_HLQUEUE(3, sdev_printk(KERN_INFO, sdp,
>                                         "sd_probe\n"));
>
> @@ -3146,6 +3239,8 @@ static int sd_remove(struct device *dev)
>         del_gendisk(sdkp->disk);
>         sd_shutdown(dev);
>
> +       sd_zbc_remove(sdkp);
> +
>         blk_register_region(devt, SD_MINORS, NULL,
>                             sd_default_probe, NULL, NULL);
>
> diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
> index 765a6f1..3452871 100644
> --- a/drivers/scsi/sd.h
> +++ b/drivers/scsi/sd.h
> @@ -56,6 +56,7 @@ enum {
>         SD_LBP_WS16,            /* Use WRITE SAME(16) with UNMAP bit */
>         SD_LBP_WS10,            /* Use WRITE SAME(10) with UNMAP bit */
>         SD_LBP_ZERO,            /* Use WRITE SAME(10) with zero payload */
> +       SD_ZBC_RESET_WP,        /* Use RESET WRITE POINTER */
>         SD_LBP_DISABLE,         /* Discard disabled due to failed cmd */
>  };
>

Can SD_ZBC_RESET_WP be added in a separate patch?


> @@ -64,6 +65,11 @@ struct scsi_disk {
>         struct scsi_device *device;
>         struct device   dev;
>         struct gendisk  *disk;
> +#ifdef CONFIG_BLK_DEV_ZONED
> +       struct workqueue_struct *zone_work_q;
> +       sector_t zone_sectors;
> +       unsigned int nr_zones;
> +#endif
>         atomic_t        openers;
>         sector_t        capacity;       /* size in logical blocks */
>         u32             max_xfer_blocks;
> @@ -94,6 +100,8 @@ struct scsi_disk {
>         unsigned        lbpvpd : 1;
>         unsigned        ws10 : 1;
>         unsigned        ws16 : 1;
> +       unsigned        rc_basis: 2;
> +       unsigned        zoned: 2;
>  };
>  #define to_scsi_disk(obj) container_of(obj,struct scsi_disk,dev)
>
> @@ -156,6 +164,13 @@ static inline unsigned int logical_to_bytes(struct scsi_device *sdev, sector_t b
>         return blocks * sdev->sector_size;
>  }
>
> +static inline sector_t sectors_to_logical(struct scsi_device *sdev, sector_t sector)
> +{
> +       return sector >> (ilog2(sdev->sector_size) - 9);
> +}
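
For reference, the unit conversion this helper performs can be sketched in userspace as below. This is only an illustration under stated assumptions: the `_sketch` names and `ilog2_u32()` are hypothetical stand-ins (the kernel uses `ilog2()` and `sector_t`), and the sector size is assumed to be a power of two of at least 512 bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's ilog2(); exact for powers of two. */
static unsigned int ilog2_u32(uint32_t v)
{
	unsigned int l = 0;

	while (v >>= 1)
		l++;
	return l;
}

/* Convert 512-byte block layer sectors to device logical blocks. */
static uint64_t sectors_to_logical_sketch(uint32_t sector_size, uint64_t sector)
{
	/* Assumption: sector_size is a power of two and >= 512. */
	assert(sector_size >= 512 && (sector_size & (sector_size - 1)) == 0);
	return sector >> (ilog2_u32(sector_size) - 9);
}

/* Inverse direction: device logical blocks to 512-byte sectors. */
static uint64_t logical_to_sectors_sketch(uint32_t sector_size, uint64_t blocks)
{
	assert(sector_size >= 512 && (sector_size & (sector_size - 1)) == 0);
	return blocks << (ilog2_u32(sector_size) - 9);
}
```

With a 4096-byte logical block the shift is 3, so 16 sectors map to 2 logical blocks; with 512-byte blocks the shift is 0 and the value is unchanged.
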
> +
> +extern void sd_config_discard(struct scsi_disk *, unsigned int);
> +
>  /*
>   * A DIF-capable target device can be formatted with different
>   * protection schemes.  Currently 0 through 3 are defined:
> @@ -269,4 +284,57 @@ static inline void sd_dif_complete(struct scsi_cmnd *cmd, unsigned int a)
>
>  #endif /* CONFIG_BLK_DEV_INTEGRITY */
>
> +#ifdef CONFIG_BLK_DEV_ZONED
> +
> +extern void sd_zbc_read_zones(struct scsi_disk *, char *);
> +extern void sd_zbc_remove(struct scsi_disk *);
> +extern int sd_zbc_setup_read_write(struct scsi_disk *, struct request *,
> +                                  sector_t, unsigned int *);
> +extern int sd_zbc_setup_report_cmnd(struct scsi_cmnd *);
> +extern int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *);
> +extern int sd_zbc_setup_open_cmnd(struct scsi_cmnd *);
> +extern int sd_zbc_setup_close_cmnd(struct scsi_cmnd *);
> +extern int sd_zbc_setup_finish_cmnd(struct scsi_cmnd *);
> +extern void sd_zbc_done(struct scsi_cmnd *, struct scsi_sense_hdr *);
> +
> +#else /* CONFIG_BLK_DEV_ZONED */
> +
> +static inline void sd_zbc_read_zones(struct scsi_disk *sdkp,
> +                                    unsigned char *buf) {}
> +static inline void sd_zbc_remove(struct scsi_disk *sdkp) {}
> +
> +static inline int sd_zbc_setup_read_write(struct scsi_disk *sdkp,
> +                                         struct request *rq, sector_t sector,
> +                                         unsigned int *num_sectors)
> +{
> +       /* Let the drive fail requests */
> +       return BLKPREP_OK;
> +}
> +
> +static inline int sd_zbc_setup_report_cmnd(struct scsi_cmnd *cmd)
> +{
> +       return BLKPREP_KILL;
> +}
> +static inline int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *cmd)
> +{
> +       return BLKPREP_KILL;
> +}
> +static inline int sd_zbc_setup_open_cmnd(struct scsi_cmnd *cmd)
> +{
> +       return BLKPREP_KILL;
> +}
> +static inline int sd_zbc_setup_close_cmnd(struct scsi_cmnd *cmd)
> +{
> +       return BLKPREP_KILL;
> +}
> +static inline int sd_zbc_setup_finish_cmnd(struct scsi_cmnd *cmd)
> +{
> +       return BLKPREP_KILL;
> +}
> +
> +static inline void sd_zbc_done(struct scsi_cmnd *cmd,
> +                              struct scsi_sense_hdr *sshdr) {}
> +
> +#endif /* CONFIG_BLK_DEV_ZONED */
> +
>  #endif /* _SCSI_DISK_H */
> diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
> new file mode 100644
> index 0000000..ec9c3fc
> --- /dev/null
> +++ b/drivers/scsi/sd_zbc.c
> @@ -0,0 +1,1097 @@
> +/*
> + * SCSI Zoned Block commands
> + *
> + * Copyright (C) 2014-2015 SUSE Linux GmbH
> + * Written by: Hannes Reinecke <hare@suse.de>
> + * Modified by: Damien Le Moal <damien.lemoal@hgst.com>
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License version
> + * 2 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; see the file COPYING.  If not, write to
> + * the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139,
> + * USA.
> + *
> + */
> +
> +#include <linux/blkdev.h>
> +#include <linux/rbtree.h>
> +
> +#include <asm/unaligned.h>
> +
> +#include <scsi/scsi.h>
> +#include <scsi/scsi_cmnd.h>
> +#include <scsi/scsi_dbg.h>
> +#include <scsi/scsi_device.h>
> +#include <scsi/scsi_driver.h>
> +#include <scsi/scsi_host.h>
> +#include <scsi/scsi_eh.h>
> +
> +#include "sd.h"
> +#include "scsi_priv.h"
> +
> +enum zbc_zone_type {
> +       ZBC_ZONE_TYPE_CONV = 0x1,
> +       ZBC_ZONE_TYPE_SEQWRITE_REQ,
> +       ZBC_ZONE_TYPE_SEQWRITE_PREF,
> +       ZBC_ZONE_TYPE_RESERVED,
> +};
> +
> +enum zbc_zone_cond {
> +       ZBC_ZONE_COND_NO_WP,
> +       ZBC_ZONE_COND_EMPTY,
> +       ZBC_ZONE_COND_IMP_OPEN,
> +       ZBC_ZONE_COND_EXP_OPEN,
> +       ZBC_ZONE_COND_CLOSED,
> +       ZBC_ZONE_COND_READONLY = 0xd,
> +       ZBC_ZONE_COND_FULL,
> +       ZBC_ZONE_COND_OFFLINE,
> +};
> +
> +#define SD_ZBC_BUF_SIZE 131072
> +
> +#define sd_zbc_debug(sdkp, fmt, args...)                       \
> +       pr_debug("%s %s [%s]: " fmt,                            \
> +                dev_driver_string(&(sdkp)->device->sdev_gendev), \
> +                dev_name(&(sdkp)->device->sdev_gendev),         \
> +                (sdkp)->disk->disk_name, ## args)
> +
> +#define sd_zbc_debug_ratelimit(sdkp, fmt, args...)             \
> +       do {                                                    \
> +               if (printk_ratelimit())                         \
> +                       sd_zbc_debug(sdkp, fmt, ## args);       \
> +       } while( 0 )
> +
> +#define sd_zbc_err(sdkp, fmt, args...)                         \
> +       pr_err("%s %s [%s]: " fmt,                              \
> +              dev_driver_string(&(sdkp)->device->sdev_gendev), \
> +              dev_name(&(sdkp)->device->sdev_gendev),          \
> +              (sdkp)->disk->disk_name, ## args)
> +
> +struct zbc_zone_work {
> +       struct work_struct      zone_work;
> +       struct scsi_disk        *sdkp;
> +       sector_t                sector;
> +       sector_t                nr_sects;
> +       bool                    init;
> +       unsigned int            nr_zones;
> +};
> +
> +struct blk_zone *zbc_desc_to_zone(struct scsi_disk *sdkp, unsigned char *rec)
> +{
> +       struct blk_zone *zone;
> +
> +       zone = kzalloc(sizeof(struct blk_zone), GFP_KERNEL);
> +       if (!zone)
> +               return NULL;
> +
> +       /* Zone type */
> +       switch(rec[0] & 0x0f) {
> +       case ZBC_ZONE_TYPE_CONV:
> +       case ZBC_ZONE_TYPE_SEQWRITE_REQ:
> +       case ZBC_ZONE_TYPE_SEQWRITE_PREF:
> +               zone->type = rec[0] & 0x0f;
> +               break;
> +       default:
> +               zone->type = BLK_ZONE_TYPE_UNKNOWN;
> +               break;
> +       }
> +
> +       /* Zone condition */
> +       zone->cond = (rec[1] >> 4) & 0xf;
> +       if (rec[1] & 0x01)
> +               zone->reset = 1;
> +       if (rec[1] & 0x02)
> +               zone->non_seq = 1;
> +
> +       /* Zone start sector and length */
> +       zone->len = logical_to_sectors(sdkp->device,
> +                                      get_unaligned_be64(&rec[8]));
> +       zone->start = logical_to_sectors(sdkp->device,
> +                                        get_unaligned_be64(&rec[16]));
> +
> +       /* Zone write pointer */
> +       if (blk_zone_is_empty(zone) &&
> +           zone->wp != zone->start)
> +               zone->wp = zone->start;
> +       else if (blk_zone_is_full(zone))
> +               zone->wp = zone->start + zone->len;
> +       else if (blk_zone_is_seq(zone))
> +               zone->wp = logical_to_sectors(sdkp->device,
> +                                             get_unaligned_be64(&rec[24]));
> +       else
> +               zone->wp = (sector_t)-1;
> +
> +       return zone;
> +}
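
To make the descriptor layout used above easier to follow, here is a minimal standalone sketch of the same field extraction. The byte offsets are taken from the quoted code; `struct zone_desc_sketch`, `be64_at()` and `parse_zone_desc_sketch()` are hypothetical names for illustration only, and the values are left in logical blocks rather than converted to 512-byte sectors.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical plain-data view of one 64-byte REPORT ZONES descriptor. */
struct zone_desc_sketch {
	uint8_t type;     /* rec[0], low nibble */
	uint8_t cond;     /* rec[1], high nibble */
	uint8_t reset;    /* rec[1], bit 0 */
	uint8_t non_seq;  /* rec[1], bit 1 */
	uint64_t len;     /* bytes 8..15, big endian, logical blocks */
	uint64_t start;   /* bytes 16..23, big endian, logical blocks */
	uint64_t wp;      /* bytes 24..31, big endian, logical blocks */
};

/* Read a big-endian 64-bit value from a possibly unaligned buffer. */
static uint64_t be64_at(const unsigned char *p)
{
	uint64_t v = 0;
	int i;

	for (i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Extract the fields the quoted zbc_desc_to_zone() reads from a descriptor. */
static void parse_zone_desc_sketch(const unsigned char *rec,
				   struct zone_desc_sketch *z)
{
	memset(z, 0, sizeof(*z));
	z->type = rec[0] & 0x0f;
	z->cond = (rec[1] >> 4) & 0x0f;
	z->reset = rec[1] & 0x01;
	z->non_seq = (rec[1] >> 1) & 0x01;
	z->len = be64_at(&rec[8]);
	z->start = be64_at(&rec[16]);
	z->wp = be64_at(&rec[24]);
}
```

The sketch deliberately stops at raw extraction; the write pointer fix-ups for empty, full and conventional zones remain as in the patch.
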
> +
> +static int zbc_parse_zones(struct scsi_disk *sdkp, unsigned char *buf,
> +                          unsigned int buf_len, sector_t *next_sector)
> +{
> +       struct request_queue *q = sdkp->disk->queue;
> +       sector_t capacity = logical_to_sectors(sdkp->device, sdkp->capacity);
> +       unsigned char *rec = buf;
> +       unsigned int zone_len, list_length;
> +
> +       /* Parse REPORT ZONES header */
> +       list_length = get_unaligned_be32(&buf[0]);
> +       rec = buf + 64;
> +       list_length += 64;
> +
> +       if (list_length < buf_len)
> +               buf_len = list_length;
> +
> +       /* Parse REPORT ZONES zone descriptors */
> +       *next_sector = capacity;
> +       while (rec < buf + buf_len) {
> +
> +               struct blk_zone *new, *old;
> +
> +               new = zbc_desc_to_zone(sdkp, rec);
> +               if (!new)
> +                       return -ENOMEM;
> +
> +               zone_len = new->len;
> +               *next_sector = new->start + zone_len;
> +
> +               old = blk_insert_zone(q, new);
> +               if (old) {
> +                       blk_lock_zone(old);
> +
> +                       /*
> +                        * Always update the zone state flags and the zone
> +                        * offline and read-only condition as the drive may
> +                        * change those independently of the commands being
> +                        * executed
> +                        */
> +                       old->reset = new->reset;
> +                       old->non_seq = new->non_seq;
> +                       if (blk_zone_is_offline(new) ||
> +                           blk_zone_is_readonly(new))
> +                               old->cond = new->cond;
> +
> +                       if (blk_zone_in_update(old)) {
> +                               old->cond = new->cond;
> +                               old->wp = new->wp;
> +                               blk_clear_zone_update(old);
> +                       }
> +
> +                       blk_unlock_zone(old);
> +
> +                       kfree(new);
> +               }
> +
> +               rec += 64;
> +
> +       }
> +
> +       return 0;
> +}
> +
> +/**
> + * sd_zbc_report_zones - Issue a REPORT ZONES scsi command
> + * @sdkp: SCSI disk to which the command should be sent
> + * @buffer: response buffer
> + * @bufflen: length of @buffer
> + * @start_sector: first sector for which zone information should be reported
> + * @option: reporting option to be used
> + * @partial: flag to set the 'partial' bit for report zones command
> + */
> +int sd_zbc_report_zones(struct scsi_disk *sdkp, unsigned char *buffer,
> +                       int bufflen, sector_t start_sector,
> +                       enum zbc_zone_reporting_options option, bool partial)
> +{
> +       struct scsi_device *sdp = sdkp->device;
> +       const int timeout = sdp->request_queue->rq_timeout;
> +       struct scsi_sense_hdr sshdr;
> +       sector_t start_lba = sectors_to_logical(sdkp->device, start_sector);
> +       unsigned char cmd[16];
> +       int result;
> +
> +       if (!scsi_device_online(sdp))
> +               return -ENODEV;
> +
> +       sd_zbc_debug(sdkp, "REPORT ZONES lba %zu len %d\n",
> +                    start_lba, bufflen);
> +
> +       memset(cmd, 0, 16);
> +       cmd[0] = ZBC_IN;
> +       cmd[1] = ZI_REPORT_ZONES;
> +       put_unaligned_be64(start_lba, &cmd[2]);
> +       put_unaligned_be32(bufflen, &cmd[10]);
> +       cmd[14] = (partial ? ZBC_REPORT_ZONE_PARTIAL : 0) | option;
> +       memset(buffer, 0, bufflen);
> +
> +       result = scsi_execute_req(sdp, cmd, DMA_FROM_DEVICE,
> +                               buffer, bufflen, &sshdr,
> +                               timeout, SD_MAX_RETRIES, NULL);
> +
> +       if (result) {
> +               sd_zbc_err(sdkp,
> +                          "REPORT ZONES lba %zu failed with %d/%d\n",
> +                          start_lba, host_byte(result), driver_byte(result));
> +               return -EIO;
> +       }
> +
> +       return 0;
> +}
> +
> +/**
> + * Set or clear the update flag of all zones contained
> + * in the range sector..sector+nr_sects.
> + * Return the number of zones marked/cleared.
> + */
> +static int __sd_zbc_zones_updating(struct scsi_disk *sdkp,
> +                                  sector_t sector, sector_t nr_sects,
> +                                  bool set)
> +{
> +       struct request_queue *q = sdkp->disk->queue;
> +       struct blk_zone *zone;
> +       struct rb_node *node;
> +       unsigned long flags;
> +       int nr_zones = 0;
> +
> +       if (!nr_sects) {
> +               /* All zones */
> +               sector = 0;
> +               nr_sects = logical_to_sectors(sdkp->device, sdkp->capacity);
> +       }
> +
> +       spin_lock_irqsave(&q->zones_lock, flags);
> +       for (node = rb_first(&q->zones); node && nr_sects; node = rb_next(node)) {
> +               zone = rb_entry(node, struct blk_zone, node);
> +               if (sector < zone->start || sector >= (zone->start + zone->len))
> +                       continue;
> +               if (set) {
> +                       if (!test_and_set_bit_lock(BLK_ZONE_IN_UPDATE, &zone->flags))
> +                               nr_zones++;
> +               } else if (test_and_clear_bit(BLK_ZONE_IN_UPDATE, &zone->flags)) {
> +                       wake_up_bit(&zone->flags, BLK_ZONE_IN_UPDATE);
> +                       nr_zones++;
> +               }
> +               sector = zone->start + zone->len;
> +               if (nr_sects <= zone->len)
> +                       nr_sects = 0;
> +               else
> +                       nr_sects -= zone->len;
> +       }
> +       spin_unlock_irqrestore(&q->zones_lock, flags);
> +
> +       return nr_zones;
> +}
> +
> +static inline int sd_zbc_set_zones_updating(struct scsi_disk *sdkp,
> +                                           sector_t sector, sector_t nr_sects)
> +{
> +       return __sd_zbc_zones_updating(sdkp, sector, nr_sects, true);
> +}
> +
> +static inline int sd_zbc_clear_zones_updating(struct scsi_disk *sdkp,
> +                                             sector_t sector, sector_t nr_sects)
> +{
> +       return __sd_zbc_zones_updating(sdkp, sector, nr_sects, false);
> +}
> +
> +static void sd_zbc_start_queue(struct request_queue *q)
> +{
> +       unsigned long flags;
> +
> +       if (q->mq_ops) {
> +               blk_mq_start_hw_queues(q);
> +       } else {
> +               spin_lock_irqsave(q->queue_lock, flags);
> +               blk_start_queue(q);
> +               spin_unlock_irqrestore(q->queue_lock, flags);
> +       }
> +}
> +
> +static void sd_zbc_update_zone_work(struct work_struct *work)
> +{
> +       struct zbc_zone_work *zwork =
> +               container_of(work, struct zbc_zone_work, zone_work);
> +       struct scsi_disk *sdkp = zwork->sdkp;
> +       sector_t capacity = logical_to_sectors(sdkp->device, sdkp->capacity);
> +       struct request_queue *q = sdkp->disk->queue;
> +       sector_t end_sector, sector = zwork->sector;
> +       unsigned int bufsize;
> +       unsigned char *buf;
> +       int ret = -ENOMEM;
> +
> +       /* Get a buffer */
> +       if (!zwork->nr_zones) {
> +               bufsize = SD_ZBC_BUF_SIZE;
> +       } else {
> +               bufsize = (zwork->nr_zones + 1) * 64;
> +               if (bufsize < 512)
> +                       bufsize = 512;
> +               else if (bufsize > SD_ZBC_BUF_SIZE)
> +                               bufsize = SD_ZBC_BUF_SIZE;
> +               else
> +                       bufsize = (bufsize + 511) & ~511;
> +       }
> +       buf = kmalloc(bufsize, GFP_KERNEL | GFP_DMA);
> +       if (!buf) {
> +               sd_zbc_err(sdkp, "Failed to allocate zone report buffer\n");
> +               goto done_free;
> +       }
> +
> +       /* Process sector range */
> +       end_sector = zwork->sector + zwork->nr_sects;
> +       while(sector < min(end_sector, capacity)) {
> +
> +               /* Get zone report */
> +               ret = sd_zbc_report_zones(sdkp, buf, bufsize, sector,
> +                                         ZBC_ZONE_REPORTING_OPTION_ALL, true);
> +               if (ret)
> +                       break;
> +
> +               ret = zbc_parse_zones(sdkp, buf, bufsize, &sector);
> +               if (ret)
> +                       break;
> +
> +               /* Kick start the queue to allow requests waiting */
> +               /* for the zones just updated to run              */
> +               sd_zbc_start_queue(q);
> +
> +       }
> +
> +done_free:
> +       if (ret)
> +               sd_zbc_clear_zones_updating(sdkp, zwork->sector, zwork->nr_sects);
> +       if (buf)
> +               kfree(buf);
> +       kfree(zwork);
> +}
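
The buffer sizing at the top of this function is easy to misread, so here is a small standalone sketch of the same arithmetic. It assumes, as the quoted code does, a 64-byte header plus 64 bytes per zone descriptor; `report_bufsize_sketch()` is a hypothetical name and not part of the patch.

```c
#include <assert.h>

#define SD_ZBC_BUF_SIZE 131072u

/*
 * Size of the REPORT ZONES reply buffer for nr_zones zones:
 * one 64-byte header plus one 64-byte descriptor per zone,
 * clamped to [512, SD_ZBC_BUF_SIZE] and rounded up to 512 bytes.
 * nr_zones == 0 means "unknown", so the maximum size is used.
 */
static unsigned int report_bufsize_sketch(unsigned int nr_zones)
{
	unsigned int bufsize;

	if (!nr_zones)
		return SD_ZBC_BUF_SIZE;

	/* Assumption for this sketch: callers pass a sane zone count (no overflow). */
	assert(nr_zones < (1u << 24));

	bufsize = (nr_zones + 1) * 64;
	if (bufsize < 512)
		return 512;
	if (bufsize > SD_ZBC_BUF_SIZE)
		return SD_ZBC_BUF_SIZE;
	return (bufsize + 511) & ~511u;
}
```

For example, 8 zones need 9 * 64 = 576 bytes, which rounds up to 1024, while anything above 2047 zones saturates at the 128 KiB cap and needs more than one report.
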
> +
> +/**
> + * sd_zbc_update_zones - Update zone information for zones starting
> + * from @start_sector. If not in init mode, the update is done only
> + * for zones marked with update flag.
> + * @sdkp: SCSI disk for which the zone information needs to be updated
> + * @start_sector: First sector of the first zone to be updated
> + * @bufsize: buffersize to be allocated for report zones
> + */
> +static int sd_zbc_update_zones(struct scsi_disk *sdkp,
> +                              sector_t sector, sector_t nr_sects,
> +                              gfp_t gfpflags, bool init)
> +{
> +       struct zbc_zone_work *zwork;
> +
> +       zwork = kzalloc(sizeof(struct zbc_zone_work), gfpflags);
> +       if (!zwork) {
> +               sd_zbc_err(sdkp, "Failed to allocate zone work\n");
> +               return -ENOMEM;
> +       }
> +
> +       if (!nr_sects) {
> +               /* All zones */
> +               sector = 0;
> +               nr_sects = logical_to_sectors(sdkp->device, sdkp->capacity);
> +       }
> +
> +       INIT_WORK(&zwork->zone_work, sd_zbc_update_zone_work);
> +       zwork->sdkp = sdkp;
> +       zwork->sector = sector;
> +       zwork->nr_sects = nr_sects;
> +       zwork->init = init;
> +
> +       if (!init)
> +               /* Mark the zones falling in the report as updating */
> +               zwork->nr_zones = sd_zbc_set_zones_updating(sdkp, sector, nr_sects);
> +
> +       if (init || zwork->nr_zones)
> +               queue_work(sdkp->zone_work_q, &zwork->zone_work);
> +       else
> +               kfree(zwork);
> +
> +       return 0;
> +}
> +
> +int sd_zbc_setup_report_cmnd(struct scsi_cmnd *cmd)
> +{
> +       struct request *rq = cmd->request;
> +       struct gendisk *disk = rq->rq_disk;
> +       struct scsi_disk *sdkp = scsi_disk(disk);
> +       int ret;
> +
> +       if (!sdkp->zone_work_q)
> +               return BLKPREP_KILL;
> +
> +       ret = sd_zbc_update_zones(sdkp, blk_rq_pos(rq), blk_rq_sectors(rq),
> +                                 GFP_ATOMIC, false);
> +       if (unlikely(ret))
> +               return BLKPREP_DEFER;
> +
> +       return BLKPREP_DONE;
> +}
> +
> +static void sd_zbc_setup_action_cmnd(struct scsi_cmnd *cmd,
> +                                    u8 action,
> +                                    bool all)
> +{
> +       struct request *rq = cmd->request;
> +       struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
> +       sector_t lba;
> +
> +       cmd->cmd_len = 16;
> +       cmd->cmnd[0] = ZBC_OUT;
> +       cmd->cmnd[1] = action;
> +       if (all) {
> +               cmd->cmnd[14] |= 0x01;
> +       } else {
> +               lba = sectors_to_logical(sdkp->device, blk_rq_pos(rq));
> +               put_unaligned_be64(lba, &cmd->cmnd[2]);
> +       }
> +
> +       rq->completion_data = NULL;
> +       rq->timeout = SD_TIMEOUT;
> +       rq->__data_len = blk_rq_bytes(rq);
> +
> +       /* Don't retry */
> +       cmd->allowed = 0;
> +       cmd->transfersize = 0;
> +       cmd->sc_data_direction = DMA_NONE;
> +}
> +
> +int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *cmd)
> +{
> +       struct request *rq = cmd->request;
> +       struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
> +       sector_t sector = blk_rq_pos(rq);
> +       sector_t nr_sects = blk_rq_sectors(rq);
> +       struct blk_zone *zone = NULL;
> +       int ret = BLKPREP_OK;
> +
> +       if (nr_sects) {
> +               zone = blk_lookup_zone(rq->q, sector);
> +               if (!zone)
> +                       return BLKPREP_KILL;
> +       }
> +
> +       if (zone) {
> +
> +               blk_lock_zone(zone);
> +
> +               /* If the zone is being updated, wait */
> +               if (blk_zone_in_update(zone)) {
> +                       ret = BLKPREP_DEFER;
> +                       goto out;
> +               }
> +
> +               if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
> +                       sd_zbc_debug(sdkp,
> +                                    "Discarding unknown zone %zu\n",
> +                                    zone->start);
> +                       ret = BLKPREP_KILL;
> +                       goto out;
> +               }
> +
> +               /* Nothing to do for conventional zones */
> +               if (blk_zone_is_conv(zone)) {
> +                       ret = BLKPREP_DONE;
> +                       goto out;
> +               }
> +
> +               if (!blk_try_write_lock_zone(zone)) {
> +                       ret = BLKPREP_DEFER;
> +                       goto out;
> +               }
> +
> +               /* Nothing to do if the zone is already empty */
> +               if (blk_zone_is_empty(zone)) {
> +                       blk_write_unlock_zone(zone);
> +                       ret = BLKPREP_DONE;
> +                       goto out;
> +               }
> +
> +               if (sector != zone->start ||
> +                   (nr_sects != zone->len)) {
> +                       sd_printk(KERN_ERR, sdkp,
> +                                 "Unaligned reset wp request, start %zu/%zu"
> +                                 " len %zu/%zu\n",
> +                                 zone->start, sector, zone->len, nr_sects);
> +                       blk_write_unlock_zone(zone);
> +                       ret = BLKPREP_KILL;
> +                       goto out;
> +               }
> +
> +       }
> +
> +       sd_zbc_setup_action_cmnd(cmd, ZO_RESET_WRITE_POINTER, !zone);
> +
> +out:
> +       if (zone) {
> +               if (ret == BLKPREP_OK) {
> +                       /*
> +                        * Opportunistic update. Will be fixed up
> +                        * with zone update if the command fails,
> +                        */
> +                       zone->wp = zone->start;
> +                       zone->cond = BLK_ZONE_COND_EMPTY;
> +                       zone->reset = 0;
> +                       zone->non_seq = 0;
> +               }
> +               blk_unlock_zone(zone);
> +       }
> +
> +       return ret;
> +}
> +
> +int sd_zbc_setup_open_cmnd(struct scsi_cmnd *cmd)
> +{
> +       struct request *rq = cmd->request;
> +       struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
> +       sector_t sector = blk_rq_pos(rq);
> +       sector_t nr_sects = blk_rq_sectors(rq);
> +       struct blk_zone *zone = NULL;
> +       int ret = BLKPREP_OK;
> +
> +       if (nr_sects) {
> +               zone = blk_lookup_zone(rq->q, sector);
> +               if (!zone)
> +                       return BLKPREP_KILL;
> +       }
> +
> +       if (zone) {
> +
> +               blk_lock_zone(zone);
> +
> +               /* If the zone is being updated, wait */
> +               if (blk_zone_in_update(zone)) {
> +                       ret = BLKPREP_DEFER;
> +                       goto out;
> +               }
> +
> +               if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
> +                       sd_zbc_debug(sdkp,
> +                                    "Opening unknown zone %zu\n",
> +                                    zone->start);
> +                       ret = BLKPREP_KILL;
> +                       goto out;
> +               }
> +
> +               /*
> +                * Nothing to do for conventional zones,
> +                * zones already open or full zones.
> +                */
> +               if (blk_zone_is_conv(zone) ||
> +                   blk_zone_is_open(zone) ||
> +                   blk_zone_is_full(zone)) {
> +                       ret = BLKPREP_DONE;
> +                       goto out;
> +               }
> +
> +               if (sector != zone->start ||
> +                   (nr_sects != zone->len)) {
> +                       sd_printk(KERN_ERR, sdkp,
> +                                 "Unaligned open zone request, start %zu/%zu"
> +                                 " len %zu/%zu\n",
> +                                 zone->start, sector, zone->len, nr_sects);
> +                       ret = BLKPREP_KILL;
> +                       goto out;
> +               }
> +
> +       }
> +
> +       sd_zbc_setup_action_cmnd(cmd, ZO_OPEN_ZONE, !zone);
> +
> +out:
> +       if (zone) {
> +               if (ret == BLKPREP_OK)
> +                       /*
> +                        * Opportunistic update. Will be fixed up
> +                        * with zone update if the command fails.
> +                        */
> +                       zone->cond = BLK_ZONE_COND_EXP_OPEN;
> +               blk_unlock_zone(zone);
> +       }
> +
> +       return ret;
> +}
> +
> +int sd_zbc_setup_close_cmnd(struct scsi_cmnd *cmd)
> +{
> +       struct request *rq = cmd->request;
> +       struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
> +       sector_t sector = blk_rq_pos(rq);
> +       sector_t nr_sects = blk_rq_sectors(rq);
> +       struct blk_zone *zone = NULL;
> +       int ret = BLKPREP_OK;
> +
> +       if (nr_sects) {
> +               zone = blk_lookup_zone(rq->q, sector);
> +               if (!zone)
> +                       return BLKPREP_KILL;
> +       }
> +
> +       if (zone) {
> +
> +               blk_lock_zone(zone);
> +
> +               /* If the zone is being updated, wait */
> +               if (blk_zone_in_update(zone)) {
> +                       ret = BLKPREP_DEFER;
> +                       goto out;
> +               }
> +
> +               if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
> +                       sd_zbc_debug(sdkp,
> +                                    "Closing unknown zone %zu\n",
> +                                    zone->start);
> +                       ret = BLKPREP_KILL;
> +                       goto out;
> +               }
> +
> +               /*
> +                * Nothing to do for conventional zones,
> +                * full zones or empty zones.
> +                */
> +               if (blk_zone_is_conv(zone) ||
> +                   blk_zone_is_full(zone) ||
> +                   blk_zone_is_empty(zone)) {
> +                       ret = BLKPREP_DONE;
> +                       goto out;
> +               }
> +
> +               if (sector != zone->start ||
> +                   (nr_sects != zone->len)) {
> +                       sd_printk(KERN_ERR, sdkp,
> +                                 "Unaligned close zone request, start %zu/%zu"
> +                                 " len %zu/%zu\n",
> +                                 zone->start, sector, zone->len, nr_sects);
> +                       ret = BLKPREP_KILL;
> +                       goto out;
> +               }
> +
> +       }
> +
> +       sd_zbc_setup_action_cmnd(cmd, ZO_CLOSE_ZONE, !zone);
> +
> +out:
> +       if (zone) {
> +               if (ret == BLKPREP_OK)
> +                       /*
> +                        * Opportunistic update. Will be fixed up
> +                        * with zone update if the command fails.
> +                        */
> +                       zone->cond = BLK_ZONE_COND_CLOSED;
> +               blk_unlock_zone(zone);
> +       }
> +
> +       return ret;
> +}
> +
> +int sd_zbc_setup_finish_cmnd(struct scsi_cmnd *cmd)
> +{
> +       struct request *rq = cmd->request;
> +       struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
> +       sector_t sector = blk_rq_pos(rq);
> +       sector_t nr_sects = blk_rq_sectors(rq);
> +       struct blk_zone *zone = NULL;
> +       int ret = BLKPREP_OK;
> +
> +       if (nr_sects) {
> +               zone = blk_lookup_zone(rq->q, sector);
> +               if (!zone)
> +                       return BLKPREP_KILL;
> +       }
> +
> +       if (zone) {
> +
> +               blk_lock_zone(zone);
> +
> +               /* If the zone is being updated, wait */
> +               if (blk_zone_in_update(zone)) {
> +                       ret = BLKPREP_DEFER;
> +                       goto out;
> +               }
> +
> +               if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
> +                       sd_zbc_debug(sdkp,
> +                                    "Finishing unknown zone %zu\n",
> +                                    zone->start);
> +                       ret = BLKPREP_KILL;
> +                       goto out;
> +               }
> +
> +               /* Nothing to do for conventional zones and full zones */
> +               if (blk_zone_is_conv(zone) ||
> +                   blk_zone_is_full(zone)) {
> +                       ret = BLKPREP_DONE;
> +                       goto out;
> +               }
> +
> +               if (sector != zone->start ||
> +                   (nr_sects != zone->len)) {
> +                       sd_printk(KERN_ERR, sdkp,
> +                                 "Unaligned finish zone request, start %zu/%zu"
> +                                 " len %zu/%zu\n",
> +                                 zone->start, sector, zone->len, nr_sects);
> +                       ret = BLKPREP_KILL;
> +                       goto out;
> +               }
> +
> +       }
> +
> +       sd_zbc_setup_action_cmnd(cmd, ZO_FINISH_ZONE, !zone);
> +
> +out:
> +       if (zone) {
> +               if (ret == BLKPREP_OK) {
> +                       /*
> +                        * Opportunistic update. Will be fixed up
> +                        * with zone update if the command fails.
> +                        */
> +                       zone->cond = BLK_ZONE_COND_FULL;
> +                       if (blk_zone_is_seq(zone))
> +                               zone->wp = zone->start + zone->len;
> +               }
> +               blk_unlock_zone(zone);
> +       }
> +
> +       return ret;
> +}
> +

Would be nice to have open/close/finish/reset share a little more code.
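
A rough userspace sketch of one way the four setup functions could be
folded together. Every name below (zone_op_prep, zone_op_is_noop,
zone_op_apply, the PREP_* and ZC_* values) is hypothetical and only
models the patch. Zone locking and the write lock taken for reset are
left out, and "open" is collapsed into a single condition:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins; none of these names exist in the patch. */
enum prep_ret { PREP_OK, PREP_KILL, PREP_DEFER, PREP_DONE };
enum zone_type { ZT_UNKNOWN, ZT_CONV, ZT_SEQ };
enum zone_cond { ZC_EMPTY, ZC_OPEN, ZC_CLOSED, ZC_FULL };
enum zone_op { ZOP_RESET, ZOP_OPEN, ZOP_CLOSE, ZOP_FINISH };

struct zone {
	uint64_t start, len, wp;
	enum zone_type type;
	enum zone_cond cond;
	bool in_update;
};

/* Per-operation part 1: is there nothing to do for this zone? */
static bool zone_op_is_noop(const struct zone *z, enum zone_op op)
{
	if (z->type == ZT_CONV)
		return true;
	switch (op) {
	case ZOP_RESET:
		return z->cond == ZC_EMPTY;
	case ZOP_OPEN:
		return z->cond == ZC_OPEN || z->cond == ZC_FULL;
	case ZOP_CLOSE:
		return z->cond == ZC_FULL || z->cond == ZC_EMPTY;
	case ZOP_FINISH:
		return z->cond == ZC_FULL;
	}
	return false;
}

/* Per-operation part 2: opportunistic update of the cached zone. */
static void zone_op_apply(struct zone *z, enum zone_op op)
{
	switch (op) {
	case ZOP_RESET:
		z->wp = z->start;
		z->cond = ZC_EMPTY;
		break;
	case ZOP_OPEN:
		z->cond = ZC_OPEN;
		break;
	case ZOP_CLOSE:
		z->cond = ZC_CLOSED;
		break;
	case ZOP_FINISH:
		z->wp = z->start + z->len;
		z->cond = ZC_FULL;
		break;
	}
}

/* Checks shared by all four operations, in the order the patch uses. */
static enum prep_ret zone_op_prep(struct zone *z, enum zone_op op,
				  uint64_t sector, uint64_t nr_sects)
{
	assert(z != NULL);
	if (z->in_update)
		return PREP_DEFER;	/* zone is being updated, wait */
	if (z->type == ZT_UNKNOWN)
		return PREP_KILL;
	if (zone_op_is_noop(z, op))
		return PREP_DONE;
	if (sector != z->start || nr_sects != z->len)
		return PREP_KILL;	/* unaligned request */
	zone_op_apply(z, op);
	return PREP_OK;
}
```

With something like this, each sd_zbc_setup_*_cmnd() would reduce to a
lookup, one call to the shared helper and the matching
sd_zbc_setup_action_cmnd() call.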

> +int sd_zbc_setup_read_write(struct scsi_disk *sdkp, struct request *rq,
> +                           sector_t sector, unsigned int *num_sectors)
> +{
> +       struct blk_zone *zone;
> +       unsigned int sectors = *num_sectors;
> +       int ret = BLKPREP_OK;
> +
> +       zone = blk_lookup_zone(rq->q, sector);
> +       if (!zone)
> +               /* Let the drive handle the request */
> +               return BLKPREP_OK;
> +
> +       blk_lock_zone(zone);
> +
> +       /* If the zone is being updated, wait */
> +       if (blk_zone_in_update(zone)) {
> +               ret = BLKPREP_DEFER;
> +               goto out;
> +       }
> +
> +       if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
> +               sd_zbc_debug(sdkp,
> +                            "Unknown zone %zu\n",
> +                            zone->start);
> +               ret = BLKPREP_KILL;
> +               goto out;
> +       }
> +
> +       /* For offline and read-only zones, let the drive fail the command */
> +       if (blk_zone_is_offline(zone) ||
> +           blk_zone_is_readonly(zone))
> +               goto out;
> +
> +       /* Do not allow zone boundaries crossing */
> +       if (sector + sectors > zone->start + zone->len) {
> +               ret = BLKPREP_KILL;
> +               goto out;
> +       }
> +
> +       /* For conventional zones, no checks */
> +       if (blk_zone_is_conv(zone))
> +               goto out;
> +
> +       if (req_op(rq) == REQ_OP_WRITE ||
> +           req_op(rq) == REQ_OP_WRITE_SAME) {
> +
> +               /*
> +                * Write requests may change the write pointer and
> +                * transition the zone condition to full. Changes
> +                * are opportunistic here. If the request fails, a
> +                * zone update will fix the zone information.
> +                */
> +               if (blk_zone_is_seq_req(zone)) {
> +
> +                       /*
> +                        * Do not issue more than one write at a time per
> +                        * zone. This solves write ordering problems due to
> +                        * the unlocking of the request queue in the dispatch
> +                        * path in the non scsi-mq case. For scsi-mq, this
> +                        * also avoids potential write reordering when multiple
> +                        * threads running on different CPUs write to the same
> +                        * zone (with a synchronized sequential pattern).
> +                        */
> +                       if (!blk_try_write_lock_zone(zone)) {
> +                               ret = BLKPREP_DEFER;
> +                               goto out;
> +                       }
> +
> +                       /* For host-managed drives, writes are allowed */
> +                       /* only at the write pointer position.         */
> +                       if (zone->wp != sector) {
> +                               blk_write_unlock_zone(zone);
> +                               ret = BLKPREP_KILL;
> +                               goto out;
> +                       }
> +
> +                       zone->wp += sectors;
> +                       if (zone->wp >= zone->start + zone->len) {
> +                               zone->cond = BLK_ZONE_COND_FULL;
> +                               zone->wp = zone->start + zone->len;
> +                       }
> +
> +               } else {
> +
> +                       /* For host-aware drives, writes are allowed */
> +                       /* anywhere in the zone, but wp can only go  */
> +                       /* forward.                                  */
> +                       sector_t end_sector = sector + sectors;
> +                       if (sector == zone->wp &&
> +                           end_sector >= zone->start + zone->len) {
> +                               zone->cond = BLK_ZONE_COND_FULL;
> +                               zone->wp = zone->start + zone->len;
> +                       } else if (end_sector > zone->wp) {
> +                               zone->wp = end_sector;
> +                       }
> +
> +               }
> +
> +       } else {
> +

If the drive does not have restricted reads,
then just goto out here.

Not all HM drives will have restricted reads and
no HA drives have restricted reads.
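
A minimal userspace sketch of that check. It assumes the URSWRZ bit
(unrestricted read in sequential write required zone) is byte 4, bit 0
of the Zoned Block Device Characteristics VPD page (B6h); the patch
does not read that bit, and the helper names here are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumption: vpd points at a Zoned Block Device Characteristics
 * VPD page (B6h) and URSWRZ is bit 0 of byte 4.
 */
static bool zbc_vpd_urswrz(const uint8_t *vpd, size_t len)
{
	assert(vpd != NULL);
	return len > 4 && (vpd[4] & 0x01);
}

/* Host-aware drives never restrict reads; host-managed ones may. */
static bool reads_restricted(bool host_managed, bool urswrz)
{
	return host_managed && !urswrz;
}

/*
 * Number of sectors a read may transfer from a sequential zone.
 * 0 means the read starts at or beyond the write pointer and the
 * caller would zero-fill the buffer instead, as the patch does.
 */
static uint64_t clamp_read(bool restricted, uint64_t wp,
			   uint64_t sector, uint64_t sectors)
{
	if (!restricted)
		return sectors;		/* let the drive handle it */
	if (sector + sectors <= wp)
		return sectors;		/* entirely below the WP */
	if (wp <= sector)
		return 0;		/* entirely beyond the WP */
	return wp - sector;		/* straddles the WP: trim */
}
```

In sd_zbc_setup_read_write() the equivalent of !restricted would simply
be the "goto out" before the write pointer checks.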

> +               /* Check read after write pointer */
> +               if (sector + sectors <= zone->wp)
> +                       goto out;
> +
> +               if (zone->wp <= sector) {
> +                       /* Read beyond WP: clear request buffer */
> +                       struct req_iterator iter;
> +                       struct bio_vec bvec;
> +                       unsigned long flags;
> +                       void *buf;
> +                       rq_for_each_segment(bvec, rq, iter) {
> +                               buf = bvec_kmap_irq(&bvec, &flags);
> +                               memset(buf, 0, bvec.bv_len);
> +                               flush_dcache_page(bvec.bv_page);
> +                               bvec_kunmap_irq(buf, &flags);
> +                       }
> +                       ret = BLKPREP_DONE;
> +                       goto out;
> +               }
> +
> +               /* Read straddle WP position: limit request size */
> +               *num_sectors = zone->wp - sector;
> +
> +       }
> +
> +out:
> +       blk_unlock_zone(zone);
> +
> +       return ret;
> +}
> +
> +void sd_zbc_done(struct scsi_cmnd *cmd,
> +                struct scsi_sense_hdr *sshdr)
> +{
> +       int result = cmd->result;
> +       struct request *rq = cmd->request;
> +       struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
> +       struct request_queue *q = sdkp->disk->queue;
> +       sector_t pos = blk_rq_pos(rq);
> +       struct blk_zone *zone = NULL;
> +       bool write_unlock = false;
> +
> +       /*
> +        * Get the target zone of commands of interest. Some may
> +        * apply to all zones so check the request sectors first.
> +        */
> +       switch (req_op(rq)) {
> +       case REQ_OP_DISCARD:
> +       case REQ_OP_WRITE:
> +       case REQ_OP_WRITE_SAME:
> +       case REQ_OP_ZONE_RESET:
> +               write_unlock = true;
> +               /* fallthru */
> +       case REQ_OP_ZONE_OPEN:
> +       case REQ_OP_ZONE_CLOSE:
> +       case REQ_OP_ZONE_FINISH:
> +               if (blk_rq_sectors(rq))
> +                       zone = blk_lookup_zone(q, pos);
> +               break;
> +       }
> +
> +       if (zone && write_unlock)
> +           blk_write_unlock_zone(zone);
> +
> +       if (!result)
> +               return;
> +
> +       if (sshdr->sense_key == ILLEGAL_REQUEST &&
> +           sshdr->asc == 0x21)
> +               /*
> +                * It is unlikely that retrying requests failed with any
> +                * kind of alignment error will result in success. So don't
> +                * try. Report the error back to the user quickly so that
> +                * corrective actions can be taken after obtaining updated
> +                * zone information.
> +                */
> +               cmd->allowed = 0;
> +
> +       /* On error, force an update unless this is a failed report */
> +       if (req_op(rq) == REQ_OP_ZONE_REPORT)
> +               sd_zbc_clear_zones_updating(sdkp, pos, blk_rq_sectors(rq));
> +       else if (zone)
> +               sd_zbc_update_zones(sdkp, zone->start, zone->len,
> +                                   GFP_ATOMIC, false);
> +}
> +
> +void sd_zbc_read_zones(struct scsi_disk *sdkp, char *buf)
> +{
> +       struct request_queue *q = sdkp->disk->queue;
> +       struct blk_zone *zone;
> +       sector_t capacity;
> +       sector_t sector;
> +       bool init = false;
> +       u32 rep_len;
> +       int ret = 0;
> +
> +       if (sdkp->zoned != 1 && sdkp->device->type != TYPE_ZBC)
> +               /*
> +                * Device managed or normal SCSI disk,
> +                * no special handling required
> +                */
> +               return;
> +
> +       /* Do a report zone to get the maximum LBA to check capacity */
> +       ret = sd_zbc_report_zones(sdkp, buf, SD_BUF_SIZE,
> +                                 0, ZBC_ZONE_REPORTING_OPTION_ALL, false);
> +       if (ret < 0)
> +               return;
> +
> +       rep_len = get_unaligned_be32(&buf[0]);
> +       if (rep_len < 64) {
> +               sd_printk(KERN_WARNING, sdkp,
> +                         "REPORT ZONES report invalid length %u\n",
> +                         rep_len);
> +               return;
> +       }
> +
> +       if (sdkp->rc_basis == 0) {
> +               /* The max_lba field is the capacity of this device */
> +               sector_t lba = get_unaligned_be64(&buf[8]);
> +               if (lba + 1 > sdkp->capacity) {
> +                       if (sdkp->first_scan)
> +                               sd_printk(KERN_WARNING, sdkp,
> +                                         "Changing capacity from %zu "
> +                                         "to max LBA+1 %zu\n",
> +                                         sdkp->capacity,
> +                                         (sector_t) lba + 1);
> +                       sdkp->capacity = lba + 1;
> +               }
> +       }
> +
> +       /* Setup the zone work queue */
> +       if (! sdkp->zone_work_q) {
> +               sdkp->zone_work_q =
> +                       alloc_ordered_workqueue("zbc_wq_%s", WQ_MEM_RECLAIM,
> +                                               sdkp->disk->disk_name);
> +               if (!sdkp->zone_work_q) {
> +                       sdev_printk(KERN_WARNING, sdkp->device,
> +                                   "Create zoned disk workqueue failed\n");
> +                       return;
> +               }
> +               init = true;
> +       }
> +
> +       /*
> +        * Parse what we already got. If all zones are not parsed yet,
> +        * kick start an update to get the remaining.
> +        */
> +       capacity = logical_to_sectors(sdkp->device, sdkp->capacity);
> +       ret = zbc_parse_zones(sdkp, buf, SD_BUF_SIZE, &sector);
> +       if (ret == 0 && sector < capacity) {
> +               sd_zbc_update_zones(sdkp, sector, capacity - sector,
> +                                   GFP_KERNEL, init);
> +               drain_workqueue(sdkp->zone_work_q);
> +       }
> +       if (ret)
> +               return;
> +
> +       /*
> +        * Analyze the zones layout: if all zones are the same size and
> +        * the size is a power of 2, chunk the device and map discard to
> +        * reset write pointer command. Otherwise, disable discard.
> +        */
> +       sdkp->zone_sectors = 0;
> +       sdkp->nr_zones = 0;
> +       sector = 0;
> +       while(sector < capacity) {
> +
> +               zone = blk_lookup_zone(q, sector);
> +               if (!zone) {
> +                       sdkp->zone_sectors = 0;
> +                       sdkp->nr_zones = 0;
> +                       break;
> +               }
> +
> +               sector += zone->len;
> +
> +               if (sdkp->zone_sectors == 0) {
> +                       sdkp->zone_sectors = zone->len;
> +               } else if (sector != capacity &&
> +                        zone->len != sdkp->zone_sectors) {
> +                       sdkp->zone_sectors = 0;
> +                       sdkp->nr_zones = 0;
> +                       break;
> +               }
> +
> +               sdkp->nr_zones++;
> +
> +       }
> +
> +       if (!sdkp->zone_sectors ||
> +           !is_power_of_2(sdkp->zone_sectors)) {
> +               sd_config_discard(sdkp, SD_LBP_DISABLE);
> +               if (sdkp->first_scan)
> +                       sd_printk(KERN_NOTICE, sdkp,
> +                                 "%u zones (non constant zone size)\n",
> +                                 sdkp->nr_zones);
> +               return;
> +       }
> +
> +       /* Setup discard granularity to the zone size */
> +       blk_queue_chunk_sectors(sdkp->disk->queue, sdkp->zone_sectors);
> +       sdkp->max_unmap_blocks = sdkp->zone_sectors;
> +       sdkp->unmap_alignment = sectors_to_logical(sdkp->device,
> +                                                  sdkp->zone_sectors);
> +       sdkp->unmap_granularity = sdkp->unmap_alignment;
> +       sd_config_discard(sdkp, SD_ZBC_RESET_WP);
> +
> +       if (sdkp->first_scan) {
> +               if (sdkp->nr_zones * sdkp->zone_sectors == capacity)
> +                       sd_printk(KERN_NOTICE, sdkp,
> +                                 "%u zones of %zu sectors\n",
> +                                 sdkp->nr_zones,
> +                                 sdkp->zone_sectors);
> +               else
> +                       sd_printk(KERN_NOTICE, sdkp,
> +                                 "%u zones of %zu sectors "
> +                                 "+ 1 runt zone\n",
> +                                 sdkp->nr_zones - 1,
> +                                 sdkp->zone_sectors);
> +       }
> +}
> +
> +void sd_zbc_remove(struct scsi_disk *sdkp)
> +{
> +
> +       sd_config_discard(sdkp, SD_LBP_DISABLE);
> +
> +       if (sdkp->zone_work_q) {
> +               drain_workqueue(sdkp->zone_work_q);
> +               destroy_workqueue(sdkp->zone_work_q);
> +               sdkp->zone_work_q = NULL;
> +               blk_drop_zones(sdkp->disk->queue);
> +       }
> +}
> +
> diff --git a/include/scsi/scsi_proto.h b/include/scsi/scsi_proto.h
> index d1defd1..6ba66e0 100644
> --- a/include/scsi/scsi_proto.h
> +++ b/include/scsi/scsi_proto.h
> @@ -299,4 +299,21 @@ struct scsi_lun {
>  #define SCSI_ACCESS_STATE_MASK        0x0f
>  #define SCSI_ACCESS_STATE_PREFERRED   0x80
>
> +/* Reporting options for REPORT ZONES */
> +enum zbc_zone_reporting_options {
> +       ZBC_ZONE_REPORTING_OPTION_ALL = 0,
> +       ZBC_ZONE_REPORTING_OPTION_EMPTY,
> +       ZBC_ZONE_REPORTING_OPTION_IMPLICIT_OPEN,
> +       ZBC_ZONE_REPORTING_OPTION_EXPLICIT_OPEN,
> +       ZBC_ZONE_REPORTING_OPTION_CLOSED,
> +       ZBC_ZONE_REPORTING_OPTION_FULL,
> +       ZBC_ZONE_REPORTING_OPTION_READONLY,
> +       ZBC_ZONE_REPORTING_OPTION_OFFLINE,
> +       ZBC_ZONE_REPORTING_OPTION_NEED_RESET_WP = 0x10,
> +       ZBC_ZONE_REPORTING_OPTION_NON_SEQWRITE,
> +       ZBC_ZONE_REPORTING_OPTION_NON_WP = 0x3f,
> +};
> +
> +#define ZBC_REPORT_ZONE_PARTIAL 0x80
> +

Why don't we expose these enums via uapi?


>  #endif /* _SCSI_PROTO_H_ */
> --
> 2.7.4
>



-- 
Shaun Tancheff


* Re: [PATCH 9/9] blk-zoned: Add ioctl interface for zone operations
  2016-09-19 21:27   ` Damien Le Moal
@ 2016-09-20  6:02     ` Shaun Tancheff
  -1 siblings, 0 replies; 36+ messages in thread
From: Shaun Tancheff @ 2016-09-20  6:02 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: linux-scsi, linux-block, Martin K. Petersen, Jens Axboe, Hannes Reinecke

On Mon, Sep 19, 2016 at 4:27 PM, Damien Le Moal <damien.lemoal@hgst.com> wrote:
> From: Shaun Tancheff <shaun.tancheff@seagate.com>
>
> Adds the new BLKUPDATEZONES, BLKREPORTZONE, BLKRESETZONE,
> BLKOPENZONE, BLKCLOSEZONE and BLKFINISHZONE ioctls.
>
> BLKREPORTZONE implementation uses the device queue zone RB-tree by
> default and no actual command is issued to the device. If the
> application needs access to the untracked zone attributes (non-seq
> flag or reset recommended flag, offline or read-only zone condition,
> etc), BLKUPDATEZONES must be issued first to force an update of the
> cached zone information.
>
> Changelog (Damien):
> * Simplified blkzone descriptor (removed bit-fields and use CPU
>   endianness)
> * Changed report ioctl to operate on single zone instead of an
>   array of blkzone structures.

I think something with this degree of changes from what
I posted should not include my signed-off-by.

I also really don't like forcing the reply to be a single zone. I
think the user should be able to ask for as many or as few as
they would like.
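
To make the shape concrete, here is a small userspace sketch of a
report that returns up to a caller-chosen number of zones. The
zone_desc structure and report_zones() are hypothetical stand-ins, not
the interface from this patch or from my earlier posting, and the
cached zones are assumed to be sorted by start sector:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical zone descriptor, loosely modelled on struct blkzone. */
struct zone_desc {
	uint64_t start;	/* zone start sector */
	uint64_t len;	/* zone length in sectors */
	uint64_t wp;	/* write pointer position */
};

/*
 * Copy at most max descriptors into out, starting with the zone that
 * contains sector. Returns the number of descriptors copied, so the
 * caller can ask for one zone, a handful, or all of them.
 */
static uint32_t report_zones(const struct zone_desc *cache, size_t nr_cached,
			     uint64_t sector, struct zone_desc *out,
			     uint32_t max)
{
	uint32_t n = 0;

	assert(cache != NULL && out != NULL);
	for (size_t i = 0; i < nr_cached && n < max; i++) {
		if (cache[i].start + cache[i].len <= sector)
			continue;	/* zone ends before the start sector */
		out[n++] = cache[i];
	}
	return n;
}
```

An ioctl built this way would carry the start sector and the requested
zone count in a small header, followed by the descriptor array.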

> Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
> Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
> ---
>  block/blk-zoned.c             | 115 ++++++++++++++++++++++++++++++++++++++++++
>  block/ioctl.c                 |   8 +++
>  include/linux/blkdev.h        |   7 +++
>  include/uapi/linux/Kbuild     |   1 +
>  include/uapi/linux/blkzoned.h |  91 +++++++++++++++++++++++++++++++++
>  include/uapi/linux/fs.h       |   1 +
>  6 files changed, 223 insertions(+)
>  create mode 100644 include/uapi/linux/blkzoned.h
>
> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
> index a107940..71205c8 100644
> --- a/block/blk-zoned.c
> +++ b/block/blk-zoned.c
> @@ -12,6 +12,7 @@
>  #include <linux/module.h>
>  #include <linux/rbtree.h>
>  #include <linux/blkdev.h>
> +#include <linux/blkzoned.h>
>
>  void blk_init_zones(struct request_queue *q)
>  {
> @@ -336,3 +337,117 @@ int blkdev_finish_zone(struct block_device *bdev,
>         return blkdev_issue_zone_action(bdev, sector, REQ_OP_ZONE_FINISH,
>                                         gfp_mask);
>  }
> +
> +static int blkdev_report_zone_ioctl(struct block_device *bdev,
> +                                   void __user *argp)
> +{
> +       struct blk_zone *zone;
> +       struct blkzone z;
> +
> +       if (copy_from_user(&z, argp, sizeof(struct blkzone)))
> +               return -EFAULT;
> +
> +       zone = blk_lookup_zone(bdev_get_queue(bdev), z.start);
> +       if (!zone)
> +               return -EINVAL;
> +
> +       memset(&z, 0, sizeof(struct blkzone));
> +
> +       blk_lock_zone(zone);
> +
> +       blk_wait_for_zone_update(zone);
> +
> +       z.len = zone->len;
> +       z.start = zone->start;
> +       z.wp = zone->wp;
> +       z.type = zone->type;
> +       z.cond = zone->cond;
> +       z.non_seq = zone->non_seq;
> +       z.reset = zone->reset;
> +
> +       blk_unlock_zone(zone);
> +
> +       if (copy_to_user(argp, &z, sizeof(struct blkzone)))
> +               return -EFAULT;
> +
> +       return 0;
> +}
> +
> +static int blkdev_zone_action_ioctl(struct block_device *bdev,
> +                                   unsigned cmd, void __user *argp)
> +{
> +       unsigned int op;
> +       u64 sector;
> +
> +       if (get_user(sector, (u64 __user *)argp))
> +               return -EFAULT;
> +
> +       switch (cmd) {
> +       case BLKRESETZONE:
> +               op = REQ_OP_ZONE_RESET;
> +               break;
> +       case BLKOPENZONE:
> +               op = REQ_OP_ZONE_OPEN;
> +               break;
> +       case BLKCLOSEZONE:
> +               op = REQ_OP_ZONE_CLOSE;
> +               break;
> +       case BLKFINISHZONE:
> +               op = REQ_OP_ZONE_FINISH;
> +               break;
> +       }
> +
> +       return blkdev_issue_zone_action(bdev, sector, op, GFP_KERNEL);
> +}
> +
> +/**
> + * Called from blkdev_ioctl.
> + */
> +int blkdev_zone_ioctl(struct block_device *bdev, fmode_t mode,
> +                     unsigned cmd, unsigned long arg)
> +{
> +       void __user *argp = (void __user *)arg;
> +       struct request_queue *q;
> +       int ret;
> +
> +       if (!argp)
> +               return -EINVAL;
> +
> +       q = bdev_get_queue(bdev);
> +       if (!q)
> +               return -ENXIO;
> +
> +       if (!blk_queue_zoned(q))
> +               return -ENOTTY;
> +
> +       if (!capable(CAP_SYS_ADMIN))
> +               return -EACCES;
> +
> +       switch (cmd) {
> +       case BLKREPORTZONE:
> +               ret = blkdev_report_zone_ioctl(bdev, argp);
> +               break;
> +       case BLKUPDATEZONES:
> +               if (!(mode & FMODE_WRITE)) {
> +                       ret = -EBADF;
> +                       break;
> +               }
> +               ret = blkdev_update_zones(bdev, GFP_KERNEL);
> +               break;
> +       case BLKRESETZONE:
> +       case BLKOPENZONE:
> +       case BLKCLOSEZONE:
> +       case BLKFINISHZONE:
> +               if (!(mode & FMODE_WRITE)) {
> +                       ret = -EBADF;
> +                       break;
> +               }
> +               ret = blkdev_zone_action_ioctl(bdev, cmd, argp);
> +               break;
> +       default:
> +               ret = -ENOTTY;
> +               break;
> +       }
> +
> +       return ret;
> +}
> diff --git a/block/ioctl.c b/block/ioctl.c
> index ed2397f..f09679a 100644
> --- a/block/ioctl.c
> +++ b/block/ioctl.c
> @@ -3,6 +3,7 @@
>  #include <linux/export.h>
>  #include <linux/gfp.h>
>  #include <linux/blkpg.h>
> +#include <linux/blkzoned.h>
>  #include <linux/hdreg.h>
>  #include <linux/backing-dev.h>
>  #include <linux/fs.h>
> @@ -513,6 +514,13 @@ int blkdev_ioctl(struct block_device *bdev, fmode_t mode, unsigned cmd,
>                                 BLKDEV_DISCARD_SECURE);
>         case BLKZEROOUT:
>                 return blk_ioctl_zeroout(bdev, mode, arg);
> +       case BLKUPDATEZONES:
> +       case BLKREPORTZONE:
> +       case BLKRESETZONE:
> +       case BLKOPENZONE:
> +       case BLKCLOSEZONE:
> +       case BLKFINISHZONE:
> +               return blkdev_zone_ioctl(bdev, mode, cmd, arg);
>         case HDIO_GETGEO:
>                 return blkdev_getgeo(bdev, argp);
>         case BLKRAGET:
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index a85f95b..0299d41 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -405,9 +405,16 @@ extern int blkdev_reset_zone(struct block_device *, sector_t, gfp_t);
>  extern int blkdev_open_zone(struct block_device *, sector_t, gfp_t);
>  extern int blkdev_close_zone(struct block_device *, sector_t, gfp_t);
>  extern int blkdev_finish_zone(struct block_device *, sector_t, gfp_t);
> +extern int blkdev_zone_ioctl(struct block_device *, fmode_t, unsigned int,
> +                            unsigned long);
>  #else /* CONFIG_BLK_DEV_ZONED */
>  static inline void blk_init_zones(struct request_queue *q) { };
>  static inline void blk_drop_zones(struct request_queue *q) { };
> +static inline int blkdev_zone_ioctl(struct block_device *bdev, fmode_t mode,
> +                                   unsigned cmd, unsigned long arg)
> +{
> +       return -ENOTTY;
> +}
>  #endif /* CONFIG_BLK_DEV_ZONED */
>
>  struct request_queue {
> diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
> index 185f8ea..a2a7522 100644
> --- a/include/uapi/linux/Kbuild
> +++ b/include/uapi/linux/Kbuild
> @@ -70,6 +70,7 @@ header-y += bfs_fs.h
>  header-y += binfmts.h
>  header-y += blkpg.h
>  header-y += blktrace_api.h
> +header-y += blkzoned.h
>  header-y += bpf_common.h
>  header-y += bpf.h
>  header-y += bpqether.h
> diff --git a/include/uapi/linux/blkzoned.h b/include/uapi/linux/blkzoned.=
h
> new file mode 100644
> index 0000000..23a2702
> --- /dev/null
> +++ b/include/uapi/linux/blkzoned.h
> @@ -0,0 +1,91 @@
> +/*
> + * Zoned block devices handling.
> + *
> + * Copyright (C) 2015 Seagate Technology PLC
> + *
> + * Written by: Shaun Tancheff <shaun.tancheff@seagate.com>
> + *
> + * Modified by: Damien Le Moal <damien.lemoal@hgst.com>
> + * Copyright (C) 2016 Western Digital
> + *
> + * This file is licensed under  the terms of the GNU General Public
> + * License version 2. This program is licensed "as is" without any
> + * warranty of any kind, whether express or implied.
> + */
> +#ifndef _UAPI_BLKZONED_H
> +#define _UAPI_BLKZONED_H
> +
> +#include <linux/types.h>
> +#include <linux/ioctl.h>
> +
> +/*
> + * Zone type.
> + */
> +enum blkzone_type {
> +       BLKZONE_TYPE_UNKNOWN,
> +       BLKZONE_TYPE_CONVENTIONAL,
> +       BLKZONE_TYPE_SEQWRITE_REQ,
> +       BLKZONE_TYPE_SEQWRITE_PREF,
> +};
> +
> +/*
> + * Zone condition.
> + */
> +enum blkzone_cond {
> +       BLKZONE_COND_NO_WP,
> +       BLKZONE_COND_EMPTY,
> +       BLKZONE_COND_IMP_OPEN,
> +       BLKZONE_COND_EXP_OPEN,
> +       BLKZONE_COND_CLOSED,
> +       BLKZONE_COND_READONLY = 0xd,
> +       BLKZONE_COND_FULL,
> +       BLKZONE_COND_OFFLINE,
> +};
> +
> +/*
> + * Zone descriptor for BLKREPORTZONE.
> + * start, len and wp use the regular 512 B sector unit,
> + * regardless of the device logical block size. The overall
> + * structure size is 64 B to match the ZBC/ZAC defined zone descriptor
> + * and allow support for future additional zone information.
> + */
> +struct blkzone {
> +       __u64   start;          /* Zone start sector */
> +       __u64   len;            /* Zone length in number of sectors */
> +       __u64   wp;             /* Zone write pointer position */
> +       __u8    type;           /* Zone type */
> +       __u8    cond;           /* Zone condition */
> +       __u8    non_seq;        /* Non-sequential write resources active */
> +       __u8    reset;          /* Reset write pointer recommended */
> +       __u8    reserved[36];
> +};
> +
> +/*
> + * Zone ioctl's:
> + *
> + * BLKUPDATEZONES      : Force update of all zones information
> + * BLKREPORTZONE       : Get a zone descriptor. Takes a zone descriptor as
> + *                        argument. The zone to report is the one
> + *                        containing the sector initially specified in the
> + *                        descriptor start field.
> + * BLKRESETZONE                : Reset the write pointer of the zone containing the
> + *                        specified sector, or of all written zones if the
> + *                        sector is ~0ull.
> + * BLKOPENZONE         : Explicitly open the zone containing the
> + *                        specified sector, or all possible zones if the
> + *                        sector is ~0ull (the drive determines which zone
> + *                        to open in this case).
> + * BLKCLOSEZONE                : Close the zone containing the specified sector, or
> + *                        all open zones if the sector is ~0ull.
> + * BLKFINISHZONE       : Finish the zone (make it full) containing the
> + *                        specified sector, or all open and closed zones if
> + *                        the sector is ~0ull.
> + */
> +#define BLKUPDATEZONES _IO(0x12,130)
> +#define BLKREPORTZONE  _IOWR(0x12,131,struct blkzone)
> +#define BLKRESETZONE   _IOW(0x12,132,unsigned long long)
> +#define BLKOPENZONE    _IOW(0x12,133,unsigned long long)
> +#define BLKCLOSEZONE   _IOW(0x12,134,unsigned long long)
> +#define BLKFINISHZONE  _IOW(0x12,135,unsigned long long)
> +
> +#endif /* _UAPI_BLKZONED_H */
> diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
> index 3b00f7c..1db6d66 100644
> --- a/include/uapi/linux/fs.h
> +++ b/include/uapi/linux/fs.h
> @@ -222,6 +222,7 @@ struct fsxattr {
>  #define BLKSECDISCARD _IO(0x12,125)
>  #define BLKROTATIONAL _IO(0x12,126)
>  #define BLKZEROOUT _IO(0x12,127)
> +/* A jump here: 130-135 are used for zoned block devices (see uapi/linux/blkzoned.h) */
>
>  #define BMAP_IOCTL 1           /* obsolete - kept for compatibility */
>  #define FIBMAP    _IO(0x00,1)  /* bmap access */
> --
> 2.7.4



-- 
Shaun Tancheff

* Re: [PATCH 9/9] blk-zoned: Add ioctl interface for zone operations
@ 2016-09-20  6:02     ` Shaun Tancheff
  0 siblings, 0 replies; 36+ messages in thread
From: Shaun Tancheff @ 2016-09-20  6:02 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: linux-scsi, linux-block, Martin K. Petersen, Jens Axboe, Hannes Reinecke

On Mon, Sep 19, 2016 at 4:27 PM, Damien Le Moal <damien.lemoal@hgst.com> wrote:
> From: Shaun Tancheff <shaun.tancheff@seagate.com>
>
> Adds the new BLKUPDATEZONES, BLKREPORTZONE, BLKRESETZONE,
> BLKOPENZONE, BLKCLOSEZONE and BLKFINISHZONE ioctls.
>
> BLKREPORTZONE implementation uses the device queue zone RB-tree by
> default and no actual command is issued to the device. If the
> application needs access to the untracked zone attributes (non-seq
> flag or reset recommended flag, offline or read-only zone condition,
> etc), BLKUPDATEZONES must be issued first to force an update of the
> cached zone information.
>
> Changelog (Damien):
> * Simplified blkzone descriptor (removed bit-fields and use CPU
>   endianness)
> * Changed report ioctl to operate on single zone instead of an
>   array of blkzone structures.

I think something with this degree of changes from what
I posted should not include my signed-off-by.

I also really don't like forcing the reply to be a single zone. I
think the user should be able to ask for as many or as few as
they would like.
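
To make the cost of the single-zone interface concrete, here is a minimal userspace sketch, assuming the descriptor layout and ioctl numbers exactly as proposed in the quoted patch. The `rfc_` / `RFC_` names are hypothetical local mirrors, not an existing header, and the numbers are the ones from this posting, so they may not match any released kernel header. Walking N zones takes N ioctl calls, and the walk relies on the quoted code returning -EINVAL when the zone lookup fails past the last zone.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Local mirror of the 64-byte descriptor proposed in the quoted patch. */
struct rfc_blkzone {
	uint64_t start;		/* zone start, in 512 B sectors */
	uint64_t len;		/* zone length, in 512 B sectors */
	uint64_t wp;		/* write pointer position */
	uint8_t  type;		/* zone type */
	uint8_t  cond;		/* zone condition */
	uint8_t  non_seq;	/* non-sequential write resources active */
	uint8_t  reset;		/* reset write pointer recommended */
	uint8_t  reserved[36];
};

/* ioctl numbers as proposed in this posting (hypothetical RFC_ prefix). */
#define RFC_BLKUPDATEZONES	_IO(0x12, 130)
#define RFC_BLKREPORTZONE	_IOWR(0x12, 131, struct rfc_blkzone)

/* Report the zone containing 'sector'. Returns 0 or a negative errno. */
static int rfc_report_zone(int fd, uint64_t sector, struct rfc_blkzone *z)
{
	memset(z, 0, sizeof(*z));
	z->start = sector;	/* input: any sector inside the wanted zone */
	if (ioctl(fd, RFC_BLKREPORTZONE, z) < 0)
		return -errno;
	return 0;
}

/*
 * Fill 'zones' with up to 'max' consecutive descriptors starting at the
 * zone containing 'sector'. One ioctl is issued per zone. Returns the
 * number of zones reported or a negative errno.
 */
static int rfc_report_zones(int fd, uint64_t sector,
			    struct rfc_blkzone *zones, int max)
{
	int n;

	for (n = 0; n < max; n++) {
		int ret = rfc_report_zone(fd, sector, &zones[n]);

		if (ret == -EINVAL && n > 0)
			break;		/* walked past the last zone */
		if (ret < 0)
			return ret;
		if (zones[n].len == 0)
			return -EIO;	/* guard against looping forever */
		sector = zones[n].start + zones[n].len;
	}
	return n;
}
```

A caller that needs the untracked attributes (non_seq, reset, read-only or offline condition) would, per the commit message, issue RFC_BLKUPDATEZONES on the same descriptor before the walk. An array-based report would replace the loop with a single call.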

> Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
> Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
> ---
>  block/blk-zoned.c             | 115 ++++++++++++++++++++++++++++++++++++++++++
>  block/ioctl.c                 |   8 +++
>  include/linux/blkdev.h        |   7 +++
>  include/uapi/linux/Kbuild     |   1 +
>  include/uapi/linux/blkzoned.h |  91 +++++++++++++++++++++++++++++++++
>  include/uapi/linux/fs.h       |   1 +
>  6 files changed, 223 insertions(+)
>  create mode 100644 include/uapi/linux/blkzoned.h
>
> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
> index a107940..71205c8 100644
> --- a/block/blk-zoned.c
> +++ b/block/blk-zoned.c
> @@ -12,6 +12,7 @@
>  #include <linux/module.h>
>  #include <linux/rbtree.h>
>  #include <linux/blkdev.h>
> +#include <linux/blkzoned.h>
>
>  void blk_init_zones(struct request_queue *q)
>  {
> @@ -336,3 +337,117 @@ int blkdev_finish_zone(struct block_device *bdev,
>         return blkdev_issue_zone_action(bdev, sector, REQ_OP_ZONE_FINISH,
>                                         gfp_mask);
>  }
> +
> +static int blkdev_report_zone_ioctl(struct block_device *bdev,
> +                                   void __user *argp)
> +{
> +       struct blk_zone *zone;
> +       struct blkzone z;
> +
> +       if (copy_from_user(&z, argp, sizeof(struct blkzone)))
> +               return -EFAULT;
> +
> +       zone = blk_lookup_zone(bdev_get_queue(bdev), z.start);
> +       if (!zone)
> +               return -EINVAL;
> +
> +       memset(&z, 0, sizeof(struct blkzone));
> +
> +       blk_lock_zone(zone);
> +
> +       blk_wait_for_zone_update(zone);
> +
> +       z.len = zone->len;
> +       z.start = zone->start;
> +       z.wp = zone->wp;
> +       z.type = zone->type;
> +       z.cond = zone->cond;
> +       z.non_seq = zone->non_seq;
> +       z.reset = zone->reset;
> +
> +       blk_unlock_zone(zone);
> +
> +       if (copy_to_user(argp, &z, sizeof(struct blkzone)))
> +               return -EFAULT;
> +
> +       return 0;
> +}
> +
> +static int blkdev_zone_action_ioctl(struct block_device *bdev,
> +                                   unsigned cmd, void __user *argp)
> +{
> +       unsigned int op;
> +       u64 sector;
> +
> +       if (get_user(sector, (u64 __user *)argp))
> +               return -EFAULT;
> +
> +       switch (cmd) {
> +       case BLKRESETZONE:
> +               op = REQ_OP_ZONE_RESET;
> +               break;
> +       case BLKOPENZONE:
> +               op = REQ_OP_ZONE_OPEN;
> +               break;
> +       case BLKCLOSEZONE:
> +               op = REQ_OP_ZONE_CLOSE;
> +               break;
> +       case BLKFINISHZONE:
> +               op = REQ_OP_ZONE_FINISH;
> +               break;
> +       }
> +
> +       return blkdev_issue_zone_action(bdev, sector, op, GFP_KERNEL);
> +}
> +
> +/**
> + * Called from blkdev_ioctl.
> + */
> +int blkdev_zone_ioctl(struct block_device *bdev, fmode_t mode,
> +                     unsigned cmd, unsigned long arg)
> +{
> +       void __user *argp = (void __user *)arg;
> +       struct request_queue *q;
> +       int ret;
> +
> +       if (!argp)
> +               return -EINVAL;
> +
> +       q = bdev_get_queue(bdev);
> +       if (!q)
> +               return -ENXIO;
> +
> +       if (!blk_queue_zoned(q))
> +               return -ENOTTY;
> +
> +       if (!capable(CAP_SYS_ADMIN))
> +               return -EACCES;
> +
> +       switch (cmd) {
> +       case BLKREPORTZONE:
> +               ret = blkdev_report_zone_ioctl(bdev, argp);
> +               break;
> +       case BLKUPDATEZONES:
> +               if (!(mode & FMODE_WRITE)) {
> +                       ret = -EBADF;
> +                       break;
> +               }
> +               ret = blkdev_update_zones(bdev, GFP_KERNEL);
> +               break;
> +       case BLKRESETZONE:
> +       case BLKOPENZONE:
> +       case BLKCLOSEZONE:
> +       case BLKFINISHZONE:
> +               if (!(mode & FMODE_WRITE)) {
> +                       ret = -EBADF;
> +                       break;
> +               }
> +               ret = blkdev_zone_action_ioctl(bdev, cmd, argp);
> +               break;
> +       default:
> +               ret = -ENOTTY;
> +               break;
> +       }
> +
> +       return ret;
> +}
> diff --git a/block/ioctl.c b/block/ioctl.c
> index ed2397f..f09679a 100644
> --- a/block/ioctl.c
> +++ b/block/ioctl.c
> @@ -3,6 +3,7 @@
>  #include <linux/export.h>
>  #include <linux/gfp.h>
>  #include <linux/blkpg.h>
> +#include <linux/blkzoned.h>
>  #include <linux/hdreg.h>
>  #include <linux/backing-dev.h>
>  #include <linux/fs.h>
> @@ -513,6 +514,13 @@ int blkdev_ioctl(struct block_device *bdev, fmode_t mode, unsigned cmd,
>                                 BLKDEV_DISCARD_SECURE);
>         case BLKZEROOUT:
>                 return blk_ioctl_zeroout(bdev, mode, arg);
> +       case BLKUPDATEZONES:
> +       case BLKREPORTZONE:
> +       case BLKRESETZONE:
> +       case BLKOPENZONE:
> +       case BLKCLOSEZONE:
> +       case BLKFINISHZONE:
> +               return blkdev_zone_ioctl(bdev, mode, cmd, arg);
>         case HDIO_GETGEO:
>                 return blkdev_getgeo(bdev, argp);
>         case BLKRAGET:
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index a85f95b..0299d41 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -405,9 +405,16 @@ extern int blkdev_reset_zone(struct block_device *, sector_t, gfp_t);
>  extern int blkdev_open_zone(struct block_device *, sector_t, gfp_t);
>  extern int blkdev_close_zone(struct block_device *, sector_t, gfp_t);
>  extern int blkdev_finish_zone(struct block_device *, sector_t, gfp_t);
> +extern int blkdev_zone_ioctl(struct block_device *, fmode_t, unsigned int,
> +                            unsigned long);
>  #else /* CONFIG_BLK_DEV_ZONED */
>  static inline void blk_init_zones(struct request_queue *q) { };
>  static inline void blk_drop_zones(struct request_queue *q) { };
> +static inline int blkdev_zone_ioctl(struct block_device *bdev, fmode_t mode,
> +                                   unsigned cmd, unsigned long arg)
> +{
> +       return -ENOTTY;
> +}
>  #endif /* CONFIG_BLK_DEV_ZONED */
>
>  struct request_queue {
> diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
> index 185f8ea..a2a7522 100644
> --- a/include/uapi/linux/Kbuild
> +++ b/include/uapi/linux/Kbuild
> @@ -70,6 +70,7 @@ header-y += bfs_fs.h
>  header-y += binfmts.h
>  header-y += blkpg.h
>  header-y += blktrace_api.h
> +header-y += blkzoned.h
>  header-y += bpf_common.h
>  header-y += bpf.h
>  header-y += bpqether.h
> diff --git a/include/uapi/linux/blkzoned.h b/include/uapi/linux/blkzoned.h
> new file mode 100644
> index 0000000..23a2702
> --- /dev/null
> +++ b/include/uapi/linux/blkzoned.h
> @@ -0,0 +1,91 @@
> +/*
> + * Zoned block devices handling.
> + *
> + * Copyright (C) 2015 Seagate Technology PLC
> + *
> + * Written by: Shaun Tancheff <shaun.tancheff@seagate.com>
> + *
> + * Modified by: Damien Le Moal <damien.lemoal@hgst.com>
> + * Copyright (C) 2016 Western Digital
> + *
> + * This file is licensed under  the terms of the GNU General Public
> + * License version 2. This program is licensed "as is" without any
> + * warranty of any kind, whether express or implied.
> + */
> +#ifndef _UAPI_BLKZONED_H
> +#define _UAPI_BLKZONED_H
> +
> +#include <linux/types.h>
> +#include <linux/ioctl.h>
> +
> +/*
> + * Zone type.
> + */
> +enum blkzone_type {
> +       BLKZONE_TYPE_UNKNOWN,
> +       BLKZONE_TYPE_CONVENTIONAL,
> +       BLKZONE_TYPE_SEQWRITE_REQ,
> +       BLKZONE_TYPE_SEQWRITE_PREF,
> +};
> +
> +/*
> + * Zone condition.
> + */
> +enum blkzone_cond {
> +       BLKZONE_COND_NO_WP,
> +       BLKZONE_COND_EMPTY,
> +       BLKZONE_COND_IMP_OPEN,
> +       BLKZONE_COND_EXP_OPEN,
> +       BLKZONE_COND_CLOSED,
> +       BLKZONE_COND_READONLY = 0xd,
> +       BLKZONE_COND_FULL,
> +       BLKZONE_COND_OFFLINE,
> +};
> +
> +/*
> + * Zone descriptor for BLKREPORTZONE.
> + * start, len and wp use the regular 512 B sector unit,
> + * regardless of the device logical block size. The overall
> + * structure size is 64 B to match the ZBC/ZAC defined zone descriptor
> + * and allow support for future additional zone information.
> + */
> +struct blkzone {
> +       __u64   start;          /* Zone start sector */
> +       __u64   len;            /* Zone length in number of sectors */
> +       __u64   wp;             /* Zone write pointer position */
> +       __u8    type;           /* Zone type */
> +       __u8    cond;           /* Zone condition */
> +       __u8    non_seq;        /* Non-sequential write resources active */
> +       __u8    reset;          /* Reset write pointer recommended */
> +       __u8    reserved[36];
> +};
> +
> +/*
> + * Zone ioctl's:
> + *
> + * BLKUPDATEZONES      : Force update of all zones information
> + * BLKREPORTZONE       : Get a zone descriptor. Takes a zone descriptor as
> + *                        argument. The zone to report is the one
> + *                        containing the sector initially specified in the
> + *                        descriptor start field.
> + * BLKRESETZONE                : Reset the write pointer of the zone containing the
> + *                        specified sector, or of all written zones if the
> + *                        sector is ~0ull.
> + * BLKOPENZONE         : Explicitly open the zone containing the
> + *                        specified sector, or all possible zones if the
> + *                        sector is ~0ull (the drive determines which zone
> + *                        to open in this case).
> + * BLKCLOSEZONE                : Close the zone containing the specified sector, or
> + *                        all open zones if the sector is ~0ull.
> + * BLKFINISHZONE       : Finish the zone (make it full) containing the
> + *                        specified sector, or all open and closed zones if
> + *                        the sector is ~0ull.
> + */
> +#define BLKUPDATEZONES _IO(0x12,130)
> +#define BLKREPORTZONE  _IOWR(0x12,131,struct blkzone)
> +#define BLKRESETZONE   _IOW(0x12,132,unsigned long long)
> +#define BLKOPENZONE    _IOW(0x12,133,unsigned long long)
> +#define BLKCLOSEZONE   _IOW(0x12,134,unsigned long long)
> +#define BLKFINISHZONE  _IOW(0x12,135,unsigned long long)
> +
> +#endif /* _UAPI_BLKZONED_H */
> diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
> index 3b00f7c..1db6d66 100644
> --- a/include/uapi/linux/fs.h
> +++ b/include/uapi/linux/fs.h
> @@ -222,6 +222,7 @@ struct fsxattr {
>  #define BLKSECDISCARD _IO(0x12,125)
>  #define BLKROTATIONAL _IO(0x12,126)
>  #define BLKZEROOUT _IO(0x12,127)
> +/* A jump here: 130-135 are used for zoned block devices (see uapi/linux/blkzoned.h) */
>
>  #define BMAP_IOCTL 1           /* obsolete - kept for compatibility */
>  #define FIBMAP    _IO(0x00,1)  /* bmap access */
> --
> 2.7.4



-- 
Shaun Tancheff

* Re: [PATCH 9/9] blk-zoned: Add ioctl interface for zone operations
  2016-09-19 21:27   ` Damien Le Moal
@ 2016-09-20  6:33     ` kbuild test robot
  -1 siblings, 0 replies; 36+ messages in thread
From: kbuild test robot @ 2016-09-20  6:33 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: kbuild-all, linux-scsi, linux-block, martin.petersen, axboe,
	hare, shaun.tancheff, Damien Le Moal

[-- Attachment #1: Type: text/plain, Size: 3551 bytes --]

Hi Shaun,

[auto build test ERROR on linus/master]
[also build test ERROR on v4.8-rc7]
[cannot apply to block/for-next next-20160919]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
[Suggest to use git(>=2.9.0) format-patch --base=<commit> (or --base=auto for convenience) to record what (public, well-known) commit your patch series was built on]
[Check https://git-scm.com/docs/git-format-patch for more information]

url:    https://github.com/0day-ci/linux/commits/Damien-Le-Moal/ZBC-Zoned-block-device-support/20160920-062608
config: m32r-allyesconfig (attached as .config)
compiler: m32r-linux-gcc (GCC) 6.2.0
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=m32r 

All errors (new ones prefixed by >>):

   block/built-in.o: In function `blkdev_zone_ioctl':
>> (.text+0x394c0): undefined reference to `__get_user_bad'
   block/built-in.o: In function `blkdev_zone_ioctl':
   (.text+0x394c0): relocation truncated to fit: R_M32R_26_PCREL_RELA against undefined symbol `__get_user_bad'
   drivers/built-in.o: In function `nvme_nvm_dev_dma_free':
   lightnvm.c:(.text+0x286ae4): undefined reference to `dma_pool_free'
   lightnvm.c:(.text+0x286ae4): relocation truncated to fit: R_M32R_26_PCREL_RELA against undefined symbol `dma_pool_free'
   drivers/built-in.o: In function `nvme_nvm_dev_dma_alloc':
   lightnvm.c:(.text+0x286afc): undefined reference to `dma_pool_alloc'
   lightnvm.c:(.text+0x286afc): relocation truncated to fit: R_M32R_26_PCREL_RELA against undefined symbol `dma_pool_alloc'
   drivers/built-in.o: In function `nvme_nvm_destroy_dma_pool':
   lightnvm.c:(.text+0x286b10): undefined reference to `dma_pool_destroy'
   lightnvm.c:(.text+0x286b10): relocation truncated to fit: R_M32R_26_PCREL_RELA against undefined symbol `dma_pool_destroy'
   drivers/built-in.o: In function `nvme_nvm_create_dma_pool':
   lightnvm.c:(.text+0x286b44): undefined reference to `dma_pool_create'
   lightnvm.c:(.text+0x286b44): relocation truncated to fit: R_M32R_26_PCREL_RELA against undefined symbol `dma_pool_create'
   sound/built-in.o: In function `snd_pcm_lib_default_mmap':
   (.text+0xfcdc): undefined reference to `bad_dma_ops'
   sound/built-in.o: In function `snd_pcm_lib_default_mmap':
   (.text+0xfce0): undefined reference to `bad_dma_ops'
   sound/built-in.o: In function `snd_pcm_lib_default_mmap':
   (.text+0xfd30): undefined reference to `dma_common_mmap'
   sound/built-in.o: In function `snd_pcm_lib_default_mmap':
   (.text+0xfd30): relocation truncated to fit: R_M32R_26_PCREL_RELA against undefined symbol `dma_common_mmap'
   sound/built-in.o: In function `cygnus_pcm_preallocate_dma_buffer':
   cygnus-pcm.c:(.text+0x1100dc): undefined reference to `bad_dma_ops'
   cygnus-pcm.c:(.text+0x1100e0): undefined reference to `bad_dma_ops'
   cygnus-pcm.c:(.text+0x110114): undefined reference to `bad_dma_ops'
   sound/built-in.o: In function `cygnus_dma_free_dma_buffers':
   cygnus-pcm.c:(.text+0x110214): undefined reference to `bad_dma_ops'
   cygnus-pcm.c:(.text+0x11021c): undefined reference to `bad_dma_ops'
   sound/built-in.o:cygnus-pcm.c:(.text+0x1102b4): more undefined references to `bad_dma_ops' follow
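
Only the `__get_user_bad` error is marked as new (`>>`); the lightnvm and sound errors are not attributed to this patch. The new error most likely comes from `get_user(sector, (u64 __user *)argp)` in blkdev_zone_action_ioctl(), which is inlined into blkdev_zone_ioctl(): on m32r, get_user() appears to handle only 1-, 2- and 4-byte accesses, so an 8-byte access resolves to the deliberately undefined `__get_user_bad`. One possible fix is to fetch the argument with copy_from_user(), which has no size restriction. The sketch below shows that pattern in userspace; `stub_copy_from_user()` and `fetch_zone_sector()` are hypothetical stand-ins for illustration, not kernel functions.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/*
 * Userspace stand-in for the kernel's copy_from_user(): like the real
 * helper, it returns the number of bytes NOT copied (0 on success).
 * A NULL source pointer models a faulting user address.
 */
static unsigned long stub_copy_from_user(void *to, const void *from,
					 unsigned long n)
{
	if (!from)
		return n;
	memcpy(to, from, n);
	return 0;
}

/*
 * Fetch the 64-bit sector argument of a zone action ioctl without
 * relying on an 8-byte get_user(). Returns 0 or -EFAULT.
 */
static int fetch_zone_sector(const void *argp, uint64_t *sector)
{
	if (stub_copy_from_user(sector, argp, sizeof(*sector)))
		return -EFAULT;
	return 0;
}
```

In the kernel, the same shape would read `if (copy_from_user(&sector, argp, sizeof(sector))) return -EFAULT;` in place of the get_user() call.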

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 36365 bytes --]


Thread overview: 36+ messages
2016-09-19 21:27 [PATCH 0/9] ZBC / Zoned block device support Damien Le Moal
2016-09-19 21:27 ` Damien Le Moal
2016-09-19 21:27 ` [PATCH 1/9] block: Add 'zoned' queue limit Damien Le Moal
2016-09-19 21:27   ` Damien Le Moal
2016-09-20  4:05   ` Bart Van Assche
2016-09-20  4:05     ` Bart Van Assche
2016-09-19 21:27 ` [PATCH 2/9] blk-sysfs: Add 'chunk_sectors' to sysfs attributes Damien Le Moal
2016-09-19 21:27   ` Damien Le Moal
2016-09-19 21:27 ` [PATCH 3/9] block: update chunk_sectors in blk_stack_limits() Damien Le Moal
2016-09-19 21:27   ` Damien Le Moal
2016-09-19 21:27 ` [PATCH 4/9] block: Define zoned block device operations Damien Le Moal
2016-09-19 21:27   ` Damien Le Moal
2016-09-20  4:05   ` Bart Van Assche
2016-09-20  4:05     ` Bart Van Assche
2016-09-19 21:27 ` [PATCH 5/9] block: Implement support for zoned block devices Damien Le Moal
2016-09-19 21:27   ` Damien Le Moal
2016-09-20  4:18   ` Bart Van Assche
2016-09-20  4:18     ` Bart Van Assche
2016-09-19 21:27 ` [PATCH 6/9] block: Add 'BLKPREP_DONE' return value Damien Le Moal
2016-09-19 21:27   ` Damien Le Moal
2016-09-19 21:27 ` [PATCH 7/9] block: Add 'BLK_MQ_RQ_QUEUE_DONE' " Damien Le Moal
2016-09-19 21:27   ` Damien Le Moal
2016-09-19 21:27 ` [PATCH 8/9] sd: Implement support for ZBC devices Damien Le Moal
2016-09-19 21:27   ` Damien Le Moal
2016-09-20  0:08   ` kbuild test robot
2016-09-20  0:08     ` kbuild test robot
2016-09-20  5:40   ` Shaun Tancheff
2016-09-20  5:40     ` Shaun Tancheff
2016-09-19 21:27 ` [PATCH 9/9] blk-zoned: Add ioctl interface for zone operations Damien Le Moal
2016-09-19 21:27   ` Damien Le Moal
2016-09-20  2:39   ` kbuild test robot
2016-09-20  2:39     ` kbuild test robot
2016-09-20  6:02   ` Shaun Tancheff
2016-09-20  6:02     ` Shaun Tancheff
2016-09-20  6:33   ` kbuild test robot
2016-09-20  6:33     ` kbuild test robot
