* [PATCH 0/9] ZBC / Zoned block device support
@ 2016-09-19 21:27 ` Damien Le Moal
0 siblings, 0 replies; 36+ messages in thread
From: Damien Le Moal @ 2016-09-19 21:27 UTC (permalink / raw)
To: linux-scsi, linux-block
Cc: martin.petersen, axboe, hare, shaun.tancheff, Damien Le Moal
This series introduces support for ZBC zoned block devices. It integrates
earlier submissions by Hannes Reinecke and Shaun Tancheff and includes
rewrites and corrections suggested by Christoph Hellwig.
For zoned block devices, a zone information cache implemented as an RB-tree
is attached to the device request queue and maintained from the SCSI disk
driver layer. The cache is used to check the alignment of read and write
commands to zone boundaries and to the write pointer position within zones.
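A minimal userspace sketch of the kind of check described above, not the code from this series. The struct zone_info layout and the zone_lookup() and zone_write_ok() names are hypothetical, a sorted array with binary search stands in for the RB-tree, and every zone is assumed to be a write pointer zone:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cached zone descriptor; all units are 512-byte sectors. */
struct zone_info {
	uint64_t start;	/* first sector of the zone */
	uint64_t len;	/* zone length in sectors */
	uint64_t wp;	/* current write pointer position */
};

/* Binary search over zones sorted by start sector (stand-in for the RB-tree). */
static const struct zone_info *zone_lookup(const struct zone_info *zones,
					   size_t nr_zones, uint64_t sector)
{
	size_t lo = 0, hi = nr_zones;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		const struct zone_info *z = &zones[mid];

		if (sector < z->start)
			hi = mid;
		else if (sector >= z->start + z->len)
			lo = mid + 1;
		else
			return z;
	}
	return NULL;
}

/* Accept a write only if it starts at the write pointer and stays in the zone. */
static bool zone_write_ok(const struct zone_info *z, uint64_t sector,
			  uint64_t nr_sectors)
{
	return z && sector == z->wp &&
	       sector + nr_sectors <= z->start + z->len;
}
```

A caller would look up the zone containing the first sector of a write and reject the command when zone_write_ok() returns false.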
The generic block layer API defines functions for obtaining zone information
and manipulating zones. Operations on zones are defined as request operations
which trigger zone cache changes when processed in the sd driver.
Most of the ZBC specific code is kept out of sd.c and implemented in the
new file sd_zbc.c. Similarly, at the block layer, most of the zoned block
device code is implemented in the new blk-zoned.c.
For host-managed zoned block devices, the sequential write constraint of
write pointer zones is exposed to the user. Users of the disk (applications,
file systems or device mappers) must write to zones sequentially. This means
that for raw block device accesses from applications, buffered writes are
unreliable and direct I/Os must be used (or buffered writes with O_SYNC).
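As an illustration of the constraint above, here is a hedged userspace sketch of an application keeping its own write pointer and appending at it. The helper name zone_append_sync() is hypothetical; the file descriptor is assumed to have been opened with O_DIRECT (with suitably aligned buffers) or with O_SYNC:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes at the caller-tracked write
 * pointer *wp (a byte offset) and advance it. The fd is assumed to be
 * opened with O_DIRECT or O_SYNC so writes are issued in program order.
 */
static int zone_append_sync(int fd, off_t *wp, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t ret = pwrite(fd, p, len, *wp);

		if (ret < 0) {
			if (errno == EINTR)
				continue;
			return -errno;
		}
		if (ret == 0)
			return -EIO;	/* no progress, give up */
		/* Advance only by what was actually written. */
		*wp += ret;
		p += ret;
		len -= (size_t)ret;
	}
	return 0;
}
```

The point of the sketch is only that each write starts where the previous one ended; it does not query the device for the real write pointer.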
At the SCSI layer, write ordering is maintained at dispatch time for both
the simple queue model and scsi-mq model using a zone write lock. This lock,
implemented as a flag in zone information, prevents issuing multiple writes
to a single zone, in effect resulting in a write queue depth of 1 per zone
while allowing the drive to be operated at high queue depth overall.
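The locking idea can be sketched in userspace with a C11 atomic flag per zone. This is an assumption-laden illustration, not the sd_zbc.c implementation; struct zone_state, zone_write_trylock() and zone_write_unlock() are hypothetical names:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-zone state; the flag plays the role of the zone write lock. */
struct zone_state {
	atomic_flag write_locked;
};

/* Try to take the zone write lock at dispatch time; false means "requeue". */
static bool zone_write_trylock(struct zone_state *z)
{
	/* test_and_set returns the previous value: false means we got the lock. */
	return !atomic_flag_test_and_set(&z->write_locked);
}

/* Release the lock once the write to this zone has completed. */
static void zone_write_unlock(struct zone_state *z)
{
	atomic_flag_clear(&z->write_locked);
}
```

Because the flag is per zone, a second write to the same zone is held back while writes to other zones can still be dispatched, which is what keeps the overall queue depth high.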
Access to zone manipulation operations is also provided to applications
through a set of new ioctls. This allows applications operating on raw
block devices (e.g. mkfs.xxx) to discover a device zone layout and
manipulate zone state.
Damien Le Moal (1):
block: Add 'zoned' queue limit
Hannes Reinecke (6):
blk-sysfs: Add 'chunk_sectors' to sysfs attributes
block: update chunk_sectors in blk_stack_limits()
block: Implement support for zoned block devices
block: Add 'BLKPREP_DONE' return value
block: Add 'BLK_MQ_RQ_QUEUE_DONE' return value
sd: Implement support for ZBC devices
Shaun Tancheff (2):
block: Define zoned block device operations
blk-zoned: Add ioctl interface for zone operations
block/Kconfig | 8 +
block/Makefile | 1 +
block/blk-core.c | 53 +-
block/blk-merge.c | 31 +-
block/blk-mq.c | 1 +
block/blk-settings.c | 5 +
block/blk-sysfs.c | 29 ++
block/blk-zoned.c | 453 +++++++++++++++++
block/ioctl.c | 8 +
drivers/scsi/Makefile | 1 +
drivers/scsi/scsi_lib.c | 4 +
drivers/scsi/sd.c | 147 +++++-
drivers/scsi/sd.h | 68 +++
drivers/scsi/sd_zbc.c | 1097 +++++++++++++++++++++++++++++++++++++++++
include/linux/bio.h | 36 +-
include/linux/blk-mq.h | 1 +
include/linux/blk_types.h | 27 +-
include/linux/blkdev.h | 146 ++++++
include/scsi/scsi_proto.h | 17 +
include/uapi/linux/Kbuild | 1 +
include/uapi/linux/blkzoned.h | 91 ++++
include/uapi/linux/fs.h | 1 +
22 files changed, 2170 insertions(+), 56 deletions(-)
create mode 100644 block/blk-zoned.c
create mode 100644 drivers/scsi/sd_zbc.c
create mode 100644 include/uapi/linux/blkzoned.h
--
2.7.4
Western Digital Corporation (and its subsidiaries) E-mail Confidentiality Notice & Disclaimer:
This e-mail and any files transmitted with it may contain confidential or legally privileged information of WDC and/or its affiliates, and are intended solely for the use of the individual or entity to which they are addressed. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited. If you have received this e-mail in error, please notify the sender immediately and delete the e-mail in its entirety from your system.
^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCH 1/9] block: Add 'zoned' queue limit
2016-09-19 21:27 ` Damien Le Moal
@ 2016-09-19 21:27 ` Damien Le Moal
-1 siblings, 0 replies; 36+ messages in thread
From: Damien Le Moal @ 2016-09-19 21:27 UTC (permalink / raw)
To: linux-scsi, linux-block
Cc: martin.petersen, axboe, hare, shaun.tancheff, Damien Le Moal
Add the zoned queue limit to indicate the zoning model of a block
device. Defined values are 0 (BLK_ZONED_NONE) for regular block
devices, 1 (BLK_ZONED_HA) for host-aware zoned block devices and 2
(BLK_ZONED_HM) for host-managed zoned block devices. The drive-managed
model is not defined here since these block devices do not provide any
command for accessing zone information. The helper functions
blk_queue_zoned and bdev_zoned return the zoned limit which can in turn
be used as a boolean to test if a block device is zoned.
The zoned attribute is also exported as a string to applications via
sysfs. BLK_ZONED_NONE shows as "none", BLK_ZONED_HA as "host-aware" and
BLK_ZONED_HM as "host-managed".
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
---
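For illustration only, a userspace sketch of how an application might consume the new attribute. The enum zoned_model values mirror the 0/1/2 values listed in the commit message; parse_zoned_model() and read_zoned_model() are hypothetical helpers, not part of this patch:

```c
#include <stdio.h>
#include <string.h>

/* Mirrors the values listed in the commit message (0, 1, 2). */
enum zoned_model { ZONED_NONE = 0, ZONED_HA = 1, ZONED_HM = 2 };

/* Map the sysfs string to a model; unknown strings are treated as "none". */
static enum zoned_model parse_zoned_model(const char *s)
{
	if (strncmp(s, "host-aware", 10) == 0)
		return ZONED_HA;
	if (strncmp(s, "host-managed", 12) == 0)
		return ZONED_HM;
	return ZONED_NONE;
}

/* Read /sys/block/<disk>/queue/zoned; a missing file is treated as "none". */
static enum zoned_model read_zoned_model(const char *disk)
{
	char path[256], buf[32] = "";
	FILE *f;

	snprintf(path, sizeof(path), "/sys/block/%s/queue/zoned", disk);
	f = fopen(path, "r");
	if (!f)
		return ZONED_NONE;
	if (!fgets(buf, sizeof(buf), f))
		buf[0] = '\0';
	fclose(f);
	return parse_zoned_model(buf);
}
```

Since ZONED_NONE is 0, the result can be tested as a boolean, in the same way the commit message describes for blk_queue_zoned() and bdev_zoned().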
block/blk-settings.c | 1 +
block/blk-sysfs.c | 18 ++++++++++++++++++
include/linux/blkdev.h | 25 +++++++++++++++++++++++++
3 files changed, 44 insertions(+)
diff --git a/block/blk-settings.c b/block/blk-settings.c
index f679ae1..b1d5b7f 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -107,6 +107,7 @@ void blk_set_default_limits(struct queue_limits *lim)
lim->io_opt = 0;
lim->misaligned = 0;
lim->cluster = 1;
+ lim->zoned = BLK_ZONED_NONE;
}
EXPORT_SYMBOL(blk_set_default_limits);
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index f87a7e7..31ecff9 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -257,6 +257,18 @@ QUEUE_SYSFS_BIT_FNS(random, ADD_RANDOM, 0);
QUEUE_SYSFS_BIT_FNS(iostats, IO_STAT, 0);
#undef QUEUE_SYSFS_BIT_FNS
+static ssize_t queue_zoned_show(struct request_queue *q, char *page)
+{
+ switch (blk_queue_zoned(q)) {
+ case BLK_ZONED_HA:
+ return sprintf(page, "host-aware\n");
+ case BLK_ZONED_HM:
+ return sprintf(page, "host-managed\n");
+ default:
+ return sprintf(page, "none\n");
+ }
+}
+
static ssize_t queue_nomerges_show(struct request_queue *q, char *page)
{
return queue_var_show((blk_queue_nomerges(q) << 1) |
@@ -485,6 +497,11 @@ static struct queue_sysfs_entry queue_nonrot_entry = {
.store = queue_store_nonrot,
};
+static struct queue_sysfs_entry queue_zoned_entry = {
+ .attr = {.name = "zoned", .mode = S_IRUGO },
+ .show = queue_zoned_show,
+};
+
static struct queue_sysfs_entry queue_nomerges_entry = {
.attr = {.name = "nomerges", .mode = S_IRUGO | S_IWUSR },
.show = queue_nomerges_show,
@@ -546,6 +563,7 @@ static struct attribute *default_attrs[] = {
&queue_discard_zeroes_data_entry.attr,
&queue_write_same_max_entry.attr,
&queue_nonrot_entry.attr,
+ &queue_zoned_entry.attr,
&queue_nomerges_entry.attr,
&queue_rq_affinity_entry.attr,
&queue_iostats_entry.attr,
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index e79055c..1c74b19 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -261,6 +261,15 @@ struct blk_queue_tag {
#define BLK_SCSI_MAX_CMDS (256)
#define BLK_SCSI_CMD_PER_LONG (BLK_SCSI_MAX_CMDS / (sizeof(long) * 8))
+/*
+ * Zoned block device models (zoned limit).
+ */
+enum blk_zoned_model {
+ BLK_ZONED_NONE, /* Regular block device */
+ BLK_ZONED_HA, /* Host-aware zoned block device */
+ BLK_ZONED_HM, /* Host-managed zoned block device */
+};
+
struct queue_limits {
unsigned long bounce_pfn;
unsigned long seg_boundary_mask;
@@ -290,6 +299,7 @@ struct queue_limits {
unsigned char cluster;
unsigned char discard_zeroes_data;
unsigned char raid_partial_stripes_expensive;
+ unsigned char zoned;
};
struct request_queue {
@@ -627,6 +637,11 @@ static inline unsigned int blk_queue_cluster(struct request_queue *q)
return q->limits.cluster;
}
+static inline unsigned int blk_queue_zoned(struct request_queue *q)
+{
+ return q->limits.zoned;
+}
+
/*
* We regard a request as sync, if either a read or a sync write
*/
@@ -1354,6 +1369,16 @@ static inline unsigned int bdev_write_same(struct block_device *bdev)
return 0;
}
+static inline unsigned int bdev_zoned(struct block_device *bdev)
+{
+ struct request_queue *q = bdev_get_queue(bdev);
+
+ if (q)
+ return blk_queue_zoned(q);
+
+ return 0;
+}
+
static inline int queue_dma_alignment(struct request_queue *q)
{
return q ? q->dma_alignment : 511;
--
2.7.4
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH 2/9] blk-sysfs: Add 'chunk_sectors' to sysfs attributes
2016-09-19 21:27 ` Damien Le Moal
@ 2016-09-19 21:27 ` Damien Le Moal
-1 siblings, 0 replies; 36+ messages in thread
From: Damien Le Moal @ 2016-09-19 21:27 UTC (permalink / raw)
To: linux-scsi, linux-block
Cc: martin.petersen, axboe, hare, shaun.tancheff, Damien Le Moal
From: Hannes Reinecke <hare@suse.de>
The queue limits already have a 'chunk_sectors' setting, so
we should be presenting it via sysfs.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
---
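A small sketch of what a consumer could do with the exported value, assuming chunk_sectors describes a boundary (in 512-byte sectors) that a single I/O should not cross and that 0 means no such boundary. The helper name sectors_to_chunk_end() is hypothetical:

```c
#include <stdint.h>

/* Sectors left before the next chunk boundary; chunk_sectors == 0 means no limit. */
static uint64_t sectors_to_chunk_end(uint64_t sector, uint32_t chunk_sectors)
{
	if (!chunk_sectors)
		return UINT64_MAX;
	return chunk_sectors - (sector % chunk_sectors);
}
```

With chunk_sectors read from /sys/block/<disk>/queue/chunk_sectors, an application can cap the length of an I/O starting at a given sector to the value returned here.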
block/blk-sysfs.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 31ecff9..15e5baf 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -130,6 +130,11 @@ static ssize_t queue_physical_block_size_show(struct request_queue *q, char *pag
return queue_var_show(queue_physical_block_size(q), page);
}
+static ssize_t queue_chunk_sectors_show(struct request_queue *q, char *page)
+{
+ return queue_var_show(q->limits.chunk_sectors, page);
+}
+
static ssize_t queue_io_min_show(struct request_queue *q, char *page)
{
return queue_var_show(queue_io_min(q), page);
@@ -455,6 +460,11 @@ static struct queue_sysfs_entry queue_physical_block_size_entry = {
.show = queue_physical_block_size_show,
};
+static struct queue_sysfs_entry queue_chunk_sectors_entry = {
+ .attr = {.name = "chunk_sectors", .mode = S_IRUGO },
+ .show = queue_chunk_sectors_show,
+};
+
static struct queue_sysfs_entry queue_io_min_entry = {
.attr = {.name = "minimum_io_size", .mode = S_IRUGO },
.show = queue_io_min_show,
@@ -555,6 +565,7 @@ static struct attribute *default_attrs[] = {
&queue_hw_sector_size_entry.attr,
&queue_logical_block_size_entry.attr,
&queue_physical_block_size_entry.attr,
+ &queue_chunk_sectors_entry.attr,
&queue_io_min_entry.attr,
&queue_io_opt_entry.attr,
&queue_discard_granularity_entry.attr,
--
2.7.4
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH 3/9] block: update chunk_sectors in blk_stack_limits()
2016-09-19 21:27 ` Damien Le Moal
@ 2016-09-19 21:27 ` Damien Le Moal
-1 siblings, 0 replies; 36+ messages in thread
From: Damien Le Moal @ 2016-09-19 21:27 UTC (permalink / raw)
To: linux-scsi, linux-block
Cc: martin.petersen, axboe, hare, shaun.tancheff, Hannes Reinecke,
Damien Le Moal
From: Hannes Reinecke <hare@suse.de>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
---
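Since this patch carries no description, here is a userspace sketch of the stacking rule the hunk below applies. min_not_zero_u32() is a stand-in for the kernel's min_not_zero() macro, and stack_chunk_sectors() is a hypothetical wrapper used only for illustration:

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's min_not_zero(): smallest non-zero value. */
static uint32_t min_not_zero_u32(uint32_t a, uint32_t b)
{
	if (a == 0)
		return b;
	if (b == 0)
		return a;
	return a < b ? a : b;
}

/* Same rule as the hunk: only a non-zero bottom limit changes the top one. */
static uint32_t stack_chunk_sectors(uint32_t top, uint32_t bottom)
{
	if (bottom)
		top = min_not_zero_u32(top, bottom);
	return top;
}
```

In other words, a stacked device inherits the chunk size of the underlying device when it has none of its own, and otherwise keeps the smaller of the two.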
block/blk-settings.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/block/blk-settings.c b/block/blk-settings.c
index b1d5b7f..55369a6 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -631,6 +631,10 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
t->discard_granularity;
}
+ if (b->chunk_sectors)
+ t->chunk_sectors = min_not_zero(t->chunk_sectors,
+ b->chunk_sectors);
+
return ret;
}
EXPORT_SYMBOL(blk_stack_limits);
--
2.7.4
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH 4/9] block: Define zoned block device operations
2016-09-19 21:27 ` Damien Le Moal
@ 2016-09-19 21:27 ` Damien Le Moal
-1 siblings, 0 replies; 36+ messages in thread
From: Damien Le Moal @ 2016-09-19 21:27 UTC (permalink / raw)
To: linux-scsi, linux-block
Cc: martin.petersen, axboe, hare, shaun.tancheff, Damien Le Moal
From: Shaun Tancheff <shaun.tancheff@seagate.com>
Define REQ_OP_ZONE_REPORT, REQ_OP_ZONE_RESET, REQ_OP_ZONE_OPEN,
REQ_OP_ZONE_CLOSE and REQ_OP_ZONE_FINISH for handling zones of
zoned block devices (host-managed and host-aware). With these new
commands, the total number of defined operations reaches 11, which
requires increasing REQ_OP_BITS from 3 to 4.
Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Changelog (Damien):
All requests have no payload and may operate on all zones of the
device (when the BIO sector and size are 0) or on a single zone
(when the BIO sector and size are aligned on a zone).
REQ_OP_ZONE_REPORT is not sent directly to the device
and is processed in sd_zbc.c using the device zone work
in order to parse the report reply and manage changes to
the zone information cache of the device.
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
---
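Two points from the description above, sketched in plain C under stated assumptions. zone_op_classify() is a hypothetical helper that applies the range convention (sector and size both 0 for all zones, or a zone start and the zone size in bytes for one zone) and assumes all zones have the same size; op_bits() shows why 11 operation codes no longer fit in 3 bits:

```c
#include <stdint.h>

enum zone_op_target { ZONE_OP_ALL, ZONE_OP_ONE, ZONE_OP_INVALID };

/* sector is in 512-byte units, size in bytes, zone_sectors is the zone size. */
static enum zone_op_target zone_op_classify(uint64_t sector, uint64_t size,
					    uint64_t zone_sectors)
{
	if (sector == 0 && size == 0)
		return ZONE_OP_ALL;
	if (zone_sectors && sector % zone_sectors == 0 &&
	    size == zone_sectors << 9)
		return ZONE_OP_ONE;
	return ZONE_OP_INVALID;
}

/* Bits needed to encode nr_ops distinct operation codes (0 .. nr_ops - 1). */
static unsigned int op_bits(unsigned int nr_ops)
{
	unsigned int bits = 0;

	while ((1u << bits) < nr_ops)
		bits++;
	return bits;
}
```

The 6 operations defined before this patch fit in 3 bits (up to 8 codes); 11 operations need 4.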
block/blk-core.c | 7 +++++++
block/blk-merge.c | 31 +++++++++++++++++++++++++++----
include/linux/bio.h | 36 +++++++++++++++++++++++++++---------
include/linux/blk_types.h | 27 ++++++++++++++++++++++++++-
4 files changed, 87 insertions(+), 14 deletions(-)
diff --git a/block/blk-core.c b/block/blk-core.c
index 36c7ac3..4a7f7ba 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1941,6 +1941,13 @@ generic_make_request_checks(struct bio *bio)
case REQ_OP_WRITE_SAME:
if (!bdev_write_same(bio->bi_bdev))
goto not_supported;
+ case REQ_OP_ZONE_REPORT:
+ case REQ_OP_ZONE_RESET:
+ case REQ_OP_ZONE_OPEN:
+ case REQ_OP_ZONE_CLOSE:
+ case REQ_OP_ZONE_FINISH:
+ if (!bdev_zoned(bio->bi_bdev))
+ goto not_supported;
break;
default:
break;
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 2642e5f..f9299df 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -202,6 +202,21 @@ void blk_queue_split(struct request_queue *q, struct bio **bio,
case REQ_OP_WRITE_SAME:
split = blk_bio_write_same_split(q, *bio, bs, &nsegs);
break;
+ case REQ_OP_ZONE_REPORT:
+ case REQ_OP_ZONE_RESET:
+ case REQ_OP_ZONE_OPEN:
+ case REQ_OP_ZONE_CLOSE:
+ case REQ_OP_ZONE_FINISH:
+ /*
+ * For these commands, bi_size is either 0 to specify
+ * operation on the entire block device sector range,
+ * or a zone size for operation on a single zone.
+ * Since a zone size may be much bigger than the maximum
+ * allowed BIO size, we cannot use blk_bio_segment_split.
+ */
+ split = NULL;
+ nsegs = 0;
+ break;
default:
split = blk_bio_segment_split(q, *bio, q->bio_split, &nsegs);
break;
@@ -241,11 +256,19 @@ static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
* This should probably be returning 0, but blk_add_request_payload()
* (Christoph!!!!)
*/
- if (bio_op(bio) == REQ_OP_DISCARD || bio_op(bio) == REQ_OP_SECURE_ERASE)
- return 1;
-
- if (bio_op(bio) == REQ_OP_WRITE_SAME)
+ switch (bio_op(bio)) {
+ case REQ_OP_DISCARD:
+ case REQ_OP_SECURE_ERASE:
+ case REQ_OP_WRITE_SAME:
+ case REQ_OP_ZONE_REPORT:
+ case REQ_OP_ZONE_RESET:
+ case REQ_OP_ZONE_OPEN:
+ case REQ_OP_ZONE_CLOSE:
+ case REQ_OP_ZONE_FINISH:
return 1;
+ default:
+ break;
+ }
fbio = bio;
cluster = blk_queue_cluster(q);
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 23ddf4b..d9c2e21 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -69,20 +69,38 @@
*/
static inline bool bio_has_data(struct bio *bio)
{
- if (bio &&
- bio->bi_iter.bi_size &&
- bio_op(bio) != REQ_OP_DISCARD &&
- bio_op(bio) != REQ_OP_SECURE_ERASE)
- return true;
+ if (!bio || !bio->bi_iter.bi_size)
+ return false;
- return false;
+ switch (bio_op(bio)) {
+ case REQ_OP_DISCARD:
+ case REQ_OP_SECURE_ERASE:
+ case REQ_OP_ZONE_REPORT:
+ case REQ_OP_ZONE_RESET:
+ case REQ_OP_ZONE_OPEN:
+ case REQ_OP_ZONE_CLOSE:
+ case REQ_OP_ZONE_FINISH:
+ return false;
+ default:
+ return true;
+ }
}
static inline bool bio_no_advance_iter(struct bio *bio)
{
- return bio_op(bio) == REQ_OP_DISCARD ||
- bio_op(bio) == REQ_OP_SECURE_ERASE ||
- bio_op(bio) == REQ_OP_WRITE_SAME;
+ switch (bio_op(bio)) {
+ case REQ_OP_DISCARD:
+ case REQ_OP_SECURE_ERASE:
+ case REQ_OP_WRITE_SAME:
+ case REQ_OP_ZONE_REPORT:
+ case REQ_OP_ZONE_RESET:
+ case REQ_OP_ZONE_OPEN:
+ case REQ_OP_ZONE_CLOSE:
+ case REQ_OP_ZONE_FINISH:
+ return true;
+ default:
+ return false;
+ }
}
static inline bool bio_is_rw(struct bio *bio)
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 436f43f..70df996 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -229,6 +229,26 @@ enum rq_flag_bits {
#define REQ_HASHED (1ULL << __REQ_HASHED)
#define REQ_MQ_INFLIGHT (1ULL << __REQ_MQ_INFLIGHT)
+/*
+ * Note on zone operations:
+ * All REQ_OP_ZONE_* commands do not have a payload and share a common
+ * interface for specifying operation range:
+ * (1) bio->bi_iter.bi_sector and bio->bi_iter.bi_size set to 0:
+ * the command is to operate on ALL zones of the device.
+ * (2) bio->bi_iter.bi_sector is set to a zone start sector and
+ * bio->bi_iter.bi_size is set to the zone size in bytes:
+ * the command is to operate on only the specified zone.
+ * Operation:
+ * REQ_OP_ZONE_REPORT: Request information for all zones or for a single zone.
+ * REQ_OP_ZONE_RESET: Reset the write pointer of all zones or of a single zone.
+ * REQ_OP_ZONE_OPEN: Explicitly open the maximum allowed number of zones or
+ * a single zone. For the former case, the zones that will
+ * actually be open are chosen by the disk.
+ * REQ_OP_ZONE_CLOSE: Close all implicitly or explicitly open zones or
+ * a single zone.
+ * REQ_OP_ZONE_FINISH: Transition one or all open and closed zones to the full
+ * condition.
+ */
enum req_op {
REQ_OP_READ,
REQ_OP_WRITE,
@@ -236,9 +256,14 @@ enum req_op {
REQ_OP_SECURE_ERASE, /* request to securely erase sectors */
REQ_OP_WRITE_SAME, /* write same block many times */
REQ_OP_FLUSH, /* request for cache flush */
+ REQ_OP_ZONE_REPORT, /* Get zone information */
+ REQ_OP_ZONE_RESET, /* Reset a zone write pointer */
+ REQ_OP_ZONE_OPEN, /* Explicitly open a zone */
+ REQ_OP_ZONE_CLOSE, /* Close an open zone */
+ REQ_OP_ZONE_FINISH, /* Finish a zone */
};
-#define REQ_OP_BITS 3
+#define REQ_OP_BITS 4
typedef unsigned int blk_qc_t;
#define BLK_QC_T_NONE -1U
--
2.7.4
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH 5/9] block: Implement support for zoned block devices
2016-09-19 21:27 ` Damien Le Moal
@ 2016-09-19 21:27 ` Damien Le Moal
-1 siblings, 0 replies; 36+ messages in thread
From: Damien Le Moal @ 2016-09-19 21:27 UTC (permalink / raw)
To: linux-scsi, linux-block
Cc: martin.petersen, axboe, hare, shaun.tancheff, Damien Le Moal
From: Hannes Reinecke <hare@suse.de>
Implement an RB-tree holding the zone information (struct blk_zone)
of a zoned block device and add support functions for maintaining the
RB-tree and manipulating zone structs. The block layer support does
not differentiate between host-aware and host-managed devices. The
different constraints of these zone models are handled further down
the stack by the sd driver of the generic SCSI layer.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Changelog (Damien):
* Changed struct blk_zone to be more compact (64B)
* Changed zone locking to use bit_spin_lock in place of a regular
spinlock
* Issue zone operations to the underlying block device driver
as BIOs with the operation codes REQ_OP_ZONE_*.
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
---
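The interval comparison used by blk_insert_zone and blk_lookup_zone can be tried in userspace with the sketch below. It is only an approximation of the idea: a sorted array with binary search stands in for the kernel RB-tree, no locking is shown, and the `sk_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sk_zone {
	uint64_t start;	/* first sector of the zone */
	uint64_t len;	/* zone length in sectors */
};

/*
 * Find the zone containing sector in an array sorted by start sector.
 * A zone matches when start <= sector < start + len, which is the same
 * test the tree walk applies at each node. Returns NULL if no zone
 * covers the sector.
 */
const struct sk_zone *sk_lookup_zone(const struct sk_zone *zones, size_t nr,
				     uint64_t sector)
{
	size_t lo = 0, hi = nr;

	assert(zones != NULL || nr == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		const struct sk_zone *z = &zones[mid];

		if (sector < z->start)
			hi = mid;	/* equivalent to going to rb_left */
		else if (sector >= z->start + z->len)
			lo = mid + 1;	/* equivalent to going to rb_right */
		else
			return z;
	}
	return NULL;
}
```

Because zones never overlap, the start sector alone orders the entries, and a gap between zones simply yields NULL.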
block/Kconfig | 8 ++
block/Makefile | 1 +
block/blk-core.c | 4 +
block/blk-zoned.c | 338 +++++++++++++++++++++++++++++++++++++++++++++++++
include/linux/blkdev.h | 113 +++++++++++++++++
5 files changed, 464 insertions(+)
create mode 100644 block/blk-zoned.c
diff --git a/block/Kconfig b/block/Kconfig
index 161491d..c3a18f0 100644
--- a/block/Kconfig
+++ b/block/Kconfig
@@ -88,6 +88,14 @@ config BLK_DEV_INTEGRITY
T10/SCSI Data Integrity Field or the T13/ATA External Path
Protection. If in doubt, say N.
+config BLK_DEV_ZONED
+ bool "Zoned block device support"
+ ---help---
+ Block layer zoned block device support. This option enables
+ support for ZAC/ZBC host-managed and host-aware zoned block devices.
+
+ Say yes here if you have a ZAC or ZBC storage device.
+
config BLK_DEV_THROTTLING
bool "Block layer bio throttling support"
depends on BLK_CGROUP=y
diff --git a/block/Makefile b/block/Makefile
index 9eda232..aee67fa 100644
--- a/block/Makefile
+++ b/block/Makefile
@@ -22,4 +22,5 @@ obj-$(CONFIG_IOSCHED_CFQ) += cfq-iosched.o
obj-$(CONFIG_BLOCK_COMPAT) += compat_ioctl.o
obj-$(CONFIG_BLK_CMDLINE_PARSER) += cmdline-parser.o
obj-$(CONFIG_BLK_DEV_INTEGRITY) += bio-integrity.o blk-integrity.o t10-pi.o
+obj-$(CONFIG_BLK_DEV_ZONED) += blk-zoned.o
diff --git a/block/blk-core.c b/block/blk-core.c
index 4a7f7ba..2c5d069d 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -590,6 +590,8 @@ void blk_cleanup_queue(struct request_queue *q)
blk_mq_free_queue(q);
percpu_ref_exit(&q->q_usage_counter);
+ blk_drop_zones(q);
+
spin_lock_irq(lock);
if (q->queue_lock != &q->__queue_lock)
q->queue_lock = &q->__queue_lock;
@@ -728,6 +730,8 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id)
#endif
INIT_DELAYED_WORK(&q->delay_work, blk_delay_work);
+ blk_init_zones(q);
+
kobject_init(&q->kobj, &blk_queue_ktype);
mutex_init(&q->sysfs_lock);
diff --git a/block/blk-zoned.c b/block/blk-zoned.c
new file mode 100644
index 0000000..a107940
--- /dev/null
+++ b/block/blk-zoned.c
@@ -0,0 +1,338 @@
+/*
+ * Zoned block device handling
+ *
+ * Copyright (c) 2015, Hannes Reinecke
+ * Copyright (c) 2015, SUSE Linux GmbH
+ *
+ * Copyright (c) 2016, Damien Le Moal
+ * Copyright (c) 2016, Western Digital
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/rbtree.h>
+#include <linux/blkdev.h>
+
+void blk_init_zones(struct request_queue *q)
+{
+ spin_lock_init(&q->zones_lock);
+ q->zones = RB_ROOT;
+}
+
+/**
+ * blk_drop_zones - Empty a zoned device zone tree.
+ * @q: queue of the zoned device to operate on
+ *
+ * Free all zone descriptors added to the queue zone tree.
+ */
+void blk_drop_zones(struct request_queue *q)
+{
+ struct rb_root *root = &q->zones;
+ struct blk_zone *zone, *next;
+
+ rbtree_postorder_for_each_entry_safe(zone, next, root, node)
+ kfree(zone);
+ q->zones = RB_ROOT;
+}
+EXPORT_SYMBOL_GPL(blk_drop_zones);
+
+/**
+ * blk_insert_zone - Add a new zone struct to the queue RB-tree.
+ * @q: queue of the zoned device to operate on
+ * @new_zone: The zone struct to add
+ *
+ * If @new_zone is not already added to the zone tree, add it.
+ * Otherwise, return the existing entry.
+ */
+struct blk_zone *blk_insert_zone(struct request_queue *q,
+ struct blk_zone *new_zone)
+{
+ struct rb_root *root = &q->zones;
+ struct rb_node **new = &(root->rb_node), *parent = NULL;
+ struct blk_zone *zone = NULL;
+ unsigned long flags;
+
+ spin_lock_irqsave(&q->zones_lock, flags);
+
+ /* Figure out where to put new node */
+ while (*new) {
+ zone = container_of(*new, struct blk_zone, node);
+ parent = *new;
+ if (new_zone->start + new_zone->len <= zone->start)
+ new = &((*new)->rb_left);
+ else if (new_zone->start >= zone->start + zone->len)
+ new = &((*new)->rb_right);
+ else
+ /* Return existing zone */
+ break;
+ zone = NULL;
+ }
+
+ if (!zone) {
+ /* No existing zone: add new node and rebalance tree */
+ rb_link_node(&new_zone->node, parent, new);
+ rb_insert_color(&new_zone->node, root);
+ }
+
+ spin_unlock_irqrestore(&q->zones_lock, flags);
+
+ return zone;
+}
+EXPORT_SYMBOL_GPL(blk_insert_zone);
+
+/**
+ * blk_lookup_zone - Search for a zone in a zoned device zone tree.
+ * @q: queue of the zoned device tree to search
+ * @sector: A sector within the zone to search for
+ *
+ * Search for the zone containing @sector in the zone tree owned
+ * by @q. NULL is returned if the zone is not found. Since this
+ * can be called concurrently with blk_insert_zone during device
+ * initialization, the tree traversal is protected using the
+ * zones_lock of the queue.
+ */
+struct blk_zone *blk_lookup_zone(struct request_queue *q, sector_t sector)
+{
+ struct rb_root *root = &q->zones;
+ struct rb_node *node = root->rb_node;
+ struct blk_zone *zone = NULL;
+ unsigned long flags;
+
+ spin_lock_irqsave(&q->zones_lock, flags);
+
+ while (node) {
+ zone = container_of(node, struct blk_zone, node);
+ if (sector < zone->start)
+ node = node->rb_left;
+ else if (sector >= zone->start + zone->len)
+ node = node->rb_right;
+ else
+ break;
+ zone = NULL;
+ }
+
+ spin_unlock_irqrestore(&q->zones_lock, flags);
+
+ return zone;
+}
+EXPORT_SYMBOL_GPL(blk_lookup_zone);
+
+/**
+ * Execute a zone operation (REQ_OP_ZONE*)
+ */
+static int blkdev_issue_zone_operation(struct block_device *bdev,
+ unsigned int op,
+ sector_t sector, sector_t nr_sects,
+ gfp_t gfp_mask)
+{
+ struct bio *bio;
+ int ret;
+
+ if (!bdev_zoned(bdev))
+ return -EOPNOTSUPP;
+
+ /*
+ * Make sure bi_size does not overflow because
+ * of some weird very large zone size.
+ */
+ if (nr_sects && (unsigned long long)nr_sects << 9 > UINT_MAX)
+ return -EINVAL;
+
+ bio = bio_alloc(gfp_mask, 1);
+ if (!bio)
+ return -ENOMEM;
+
+ bio->bi_iter.bi_sector = sector;
+ bio->bi_iter.bi_size = nr_sects << 9;
+ bio->bi_vcnt = 0;
+ bio->bi_bdev = bdev;
+ bio_set_op_attrs(bio, op, 0);
+
+ ret = submit_bio_wait(bio);
+
+ bio_put(bio);
+
+ return ret;
+}
+
+/**
+ * blkdev_update_zones - Force an update of the zone information of a device
+ * @bdev: Target block device
+ *
+ * Force an update of the information of all zones of @bdev. This call does not
+ * block waiting for the update to complete. On return, all zones are only
+ * marked as "in-update". Waiting on the zone update to complete can be done
+ * on a per zone basis using the function blk_wait_for_zone_update.
+ */
+int blkdev_update_zones(struct block_device *bdev,
+ gfp_t gfp_mask)
+{
+ return blkdev_issue_zone_operation(bdev, REQ_OP_ZONE_REPORT,
+ 0, 0, gfp_mask);
+}
+
+/*
+ * Wait for a zone update to complete.
+ */
+static void __blk_wait_for_zone_update(struct blk_zone *zone)
+{
+ might_sleep();
+ if (test_bit(BLK_ZONE_IN_UPDATE, &zone->flags))
+ wait_on_bit_io(&zone->flags, BLK_ZONE_IN_UPDATE,
+ TASK_UNINTERRUPTIBLE);
+}
+
+/**
+ * blk_wait_for_zone_update - Wait for a zone information update
+ * @zone: The zone to wait for
+ *
+ * This must be called with the zone lock held. If @zone is not
+ * under update, returns immediately. Otherwise, wait for the
+ * update flag to be cleared on completion of the zone information
+ * update by the device driver.
+ */
+void blk_wait_for_zone_update(struct blk_zone *zone)
+{
+ WARN_ON_ONCE(!test_bit(BLK_ZONE_LOCKED, &zone->flags));
+ while (test_bit(BLK_ZONE_IN_UPDATE, &zone->flags)) {
+ blk_unlock_zone(zone);
+ __blk_wait_for_zone_update(zone);
+ blk_lock_zone(zone);
+ }
+}
+
+/**
+ * blkdev_report_zone - Get the information of a zone
+ * @bdev: Target block device
+ * @sector: A sector of the zone to report
+ * @update: Force an update of the zone information
+ * @gfp_mask: Memory allocation flags (for bio_alloc)
+ *
+ * Get a zone from the zone cache and return it.
+ * If update is requested, issue a report zone operation
+ * and wait for the zone information to be updated.
+ */
+struct blk_zone *blkdev_report_zone(struct block_device *bdev,
+ sector_t sector,
+ bool update,
+ gfp_t gfp_mask)
+{
+ struct request_queue *q = bdev_get_queue(bdev);
+ struct blk_zone *zone;
+ int ret;
+
+ zone = blk_lookup_zone(q, sector);
+ if (!zone)
+ return ERR_PTR(-ENXIO);
+
+ if (update) {
+ ret = blkdev_issue_zone_operation(bdev, REQ_OP_ZONE_REPORT,
+ zone->start, zone->len,
+ gfp_mask);
+ if (ret)
+ return ERR_PTR(ret);
+ __blk_wait_for_zone_update(zone);
+ }
+
+ return zone;
+}
+
+/**
+ * Execute a zone action (open, close, reset or finish).
+ */
+static int blkdev_issue_zone_action(struct block_device *bdev,
+ sector_t sector, unsigned int op,
+ gfp_t gfp_mask)
+{
+ struct request_queue *q = bdev_get_queue(bdev);
+ struct blk_zone *zone;
+ sector_t nr_sects;
+ int ret;
+
+ if (!blk_queue_zoned(q))
+ return -EOPNOTSUPP;
+
+ if (sector == (sector_t)~0ULL) {
+ /* All zones */
+ sector = 0;
+ nr_sects = 0;
+ } else {
+ /* This zone */
+ zone = blk_lookup_zone(q, sector);
+ if (!zone)
+ return -ENXIO;
+ sector = zone->start;
+ nr_sects = zone->len;
+ }
+
+ ret = blkdev_issue_zone_operation(bdev, op, sector,
+ nr_sects, gfp_mask);
+ if (ret == 0 && !nr_sects)
+ blkdev_update_zones(bdev, gfp_mask);
+
+ return ret;
+}
+
+/**
+ * blkdev_reset_zone - Reset a zone write pointer
+ * @bdev: target block device
+ * @sector: A sector of the zone to reset or ~0ULL for all zones.
+ * @gfp_mask: memory allocation flags (for bio_alloc)
+ *
+ * Description:
+ * Reset the write pointer of a zone or of all zones.
+ */
+int blkdev_reset_zone(struct block_device *bdev,
+ sector_t sector, gfp_t gfp_mask)
+{
+ return blkdev_issue_zone_action(bdev, sector, REQ_OP_ZONE_RESET,
+ gfp_mask);
+}
+
+/**
+ * blkdev_open_zone - Explicitly open a zone
+ * @bdev: target block device
+ * @sector: A sector of the zone to open or ~0ULL for all zones.
+ * @gfp_mask: memory allocation flags (for bio_alloc)
+ *
+ * Description:
+ * Open a zone or all possible zones.
+ */
+int blkdev_open_zone(struct block_device *bdev,
+ sector_t sector, gfp_t gfp_mask)
+{
+ return blkdev_issue_zone_action(bdev, sector, REQ_OP_ZONE_OPEN,
+ gfp_mask);
+}
+
+/**
+ * blkdev_close_zone - Close an open zone
+ * @bdev: target block device
+ * @sector: A sector of the zone to close or ~0ULL for all zones.
+ * @gfp_mask: memory allocation flags (for bio_alloc)
+ *
+ * Description:
+ * Close a zone or all open zones.
+ */
+int blkdev_close_zone(struct block_device *bdev,
+ sector_t sector, gfp_t gfp_mask)
+{
+ return blkdev_issue_zone_action(bdev, sector, REQ_OP_ZONE_CLOSE,
+ gfp_mask);
+}
+
+/**
+ * blkdev_finish_zone - Finish a zone (make it full)
+ * @bdev: target block device
+ * @sector: A sector of the zone to finish or ~0ULL for all zones.
+ * @gfp_mask: memory allocation flags (for bio_alloc)
+ *
+ * Description:
+ * Finish one zone or all possible zones.
+ */
+int blkdev_finish_zone(struct block_device *bdev,
+ sector_t sector, gfp_t gfp_mask)
+{
+ return blkdev_issue_zone_action(bdev, sector, REQ_OP_ZONE_FINISH,
+ gfp_mask);
+}
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 1c74b19..1165594 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -24,6 +24,7 @@
#include <linux/rcupdate.h>
#include <linux/percpu-refcount.h>
#include <linux/scatterlist.h>
+#include <linux/bit_spinlock.h>
struct module;
struct scsi_ioctl_command;
@@ -302,6 +303,113 @@ struct queue_limits {
unsigned char zoned;
};
+#ifdef CONFIG_BLK_DEV_ZONED
+
+enum blk_zone_type {
+ BLK_ZONE_TYPE_UNKNOWN,
+ BLK_ZONE_TYPE_CONVENTIONAL,
+ BLK_ZONE_TYPE_SEQWRITE_REQ,
+ BLK_ZONE_TYPE_SEQWRITE_PREF,
+};
+
+enum blk_zone_cond {
+ BLK_ZONE_COND_NO_WP,
+ BLK_ZONE_COND_EMPTY,
+ BLK_ZONE_COND_IMP_OPEN,
+ BLK_ZONE_COND_EXP_OPEN,
+ BLK_ZONE_COND_CLOSED,
+ BLK_ZONE_COND_READONLY = 0xd,
+ BLK_ZONE_COND_FULL,
+ BLK_ZONE_COND_OFFLINE,
+};
+
+enum blk_zone_flags {
+ BLK_ZONE_LOCKED,
+ BLK_ZONE_WRITE_LOCKED,
+ BLK_ZONE_IN_UPDATE,
+};
+
+/**
+ * Zone descriptor. On 64-bit architectures,
+ * this will align on sizeof(long), i.e. 8 B,
+ * and use 64 B.
+ */
+struct blk_zone {
+ struct rb_node node;
+ unsigned long flags;
+ sector_t len;
+ sector_t start;
+ sector_t wp;
+ unsigned int type : 4;
+ unsigned int cond : 4;
+ unsigned int non_seq : 1;
+ unsigned int reset : 1;
+};
+
+#define blk_zone_is_seq_req(z) ((z)->type == BLK_ZONE_TYPE_SEQWRITE_REQ)
+#define blk_zone_is_seq_pref(z) ((z)->type == BLK_ZONE_TYPE_SEQWRITE_PREF)
+#define blk_zone_is_seq(z) (blk_zone_is_seq_req(z) || blk_zone_is_seq_pref(z))
+#define blk_zone_is_conv(z) ((z)->type == BLK_ZONE_TYPE_CONVENTIONAL)
+
+#define blk_zone_is_readonly(z) ((z)->cond == BLK_ZONE_COND_READONLY)
+#define blk_zone_is_offline(z) ((z)->cond == BLK_ZONE_COND_OFFLINE)
+#define blk_zone_is_full(z) ((z)->cond == BLK_ZONE_COND_FULL)
+#define blk_zone_is_empty(z) ((z)->cond == BLK_ZONE_COND_EMPTY)
+#define blk_zone_is_open(z) ((z)->cond == BLK_ZONE_COND_EXP_OPEN)
+
+static inline void blk_lock_zone(struct blk_zone *zone)
+{
+ bit_spin_lock(BLK_ZONE_LOCKED, &zone->flags);
+}
+
+static inline int blk_trylock_zone(struct blk_zone *zone)
+{
+ return bit_spin_trylock(BLK_ZONE_LOCKED, &zone->flags);
+}
+
+static inline void blk_unlock_zone(struct blk_zone *zone)
+{
+ bit_spin_unlock(BLK_ZONE_LOCKED, &zone->flags);
+}
+
+static inline int blk_try_write_lock_zone(struct blk_zone *zone)
+{
+ return !test_and_set_bit(BLK_ZONE_WRITE_LOCKED, &zone->flags);
+}
+
+static inline void blk_write_unlock_zone(struct blk_zone *zone)
+{
+ clear_bit_unlock(BLK_ZONE_WRITE_LOCKED, &zone->flags);
+ smp_mb__after_atomic();
+}
+
+extern void blk_init_zones(struct request_queue *);
+extern void blk_drop_zones(struct request_queue *);
+extern struct blk_zone *blk_insert_zone(struct request_queue *,
+ struct blk_zone *);
+extern struct blk_zone *blk_lookup_zone(struct request_queue *, sector_t);
+
+extern int blkdev_update_zones(struct block_device *, gfp_t);
+extern void blk_wait_for_zone_update(struct blk_zone *);
+#define blk_zone_in_update(z) test_bit(BLK_ZONE_IN_UPDATE, &(z)->flags)
+static inline void blk_clear_zone_update(struct blk_zone *zone)
+{
+ clear_bit_unlock(BLK_ZONE_IN_UPDATE, &zone->flags);
+ smp_mb__after_atomic();
+ wake_up_bit(&zone->flags, BLK_ZONE_IN_UPDATE);
+}
+
+extern struct blk_zone *blkdev_report_zone(struct block_device *,
+ sector_t, bool, gfp_t);
+extern int blkdev_reset_zone(struct block_device *, sector_t, gfp_t);
+extern int blkdev_open_zone(struct block_device *, sector_t, gfp_t);
+extern int blkdev_close_zone(struct block_device *, sector_t, gfp_t);
+extern int blkdev_finish_zone(struct block_device *, sector_t, gfp_t);
+#else /* CONFIG_BLK_DEV_ZONED */
+static inline void blk_init_zones(struct request_queue *q) { }
+static inline void blk_drop_zones(struct request_queue *q) { }
+#endif /* CONFIG_BLK_DEV_ZONED */
+
struct request_queue {
/*
* Together with queue_head for cacheline sharing
@@ -404,6 +512,11 @@ struct request_queue {
unsigned int nr_pending;
#endif
+#ifdef CONFIG_BLK_DEV_ZONED
+ spinlock_t zones_lock;
+ struct rb_root zones;
+#endif
+
/*
* queue settings
*/
--
2.7.4
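The zone write lock helpers added to blkdev.h amount to an atomic test-and-set on one bit of the zone flags word. The userspace sketch below mimics that with C11 atomics. It is an approximation under stated assumptions: the `sk_` names are hypothetical, and acquire/release ordering is used here in place of the kernel bit operations and their barriers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Bit numbers, in the same order as enum blk_zone_flags in the patch. */
enum sk_zone_flags {
	SK_ZONE_LOCKED,
	SK_ZONE_WRITE_LOCKED,
	SK_ZONE_IN_UPDATE,
};

struct sk_zone_state {
	atomic_ulong flags;
};

/*
 * Try to take the zone write lock. Returns true if the bit was clear
 * and the caller now owns the lock, false if a write is already in
 * flight for this zone.
 */
bool sk_try_write_lock_zone(struct sk_zone_state *zone)
{
	unsigned long mask = 1UL << SK_ZONE_WRITE_LOCKED;
	unsigned long old;

	assert(zone != NULL);
	old = atomic_fetch_or_explicit(&zone->flags, mask,
				       memory_order_acquire);
	return !(old & mask);
}

/* Release the zone write lock, leaving the other flag bits untouched. */
void sk_write_unlock_zone(struct sk_zone_state *zone)
{
	unsigned long mask = 1UL << SK_ZONE_WRITE_LOCKED;

	assert(zone != NULL);
	atomic_fetch_and_explicit(&zone->flags, ~mask, memory_order_release);
}
```

A dispatcher that fails to take the lock would leave the write queued and retry later, which is what limits each zone to a single write in flight.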
* [PATCH 5/9] block: Implement support for zoned block devices
@ 2016-09-19 21:27 ` Damien Le Moal
0 siblings, 0 replies; 36+ messages in thread
From: Damien Le Moal @ 2016-09-19 21:27 UTC (permalink / raw)
To: linux-scsi, linux-block
Cc: martin.petersen, axboe, hare, shaun.tancheff, Damien Le Moal
From: Hannes Reinecke <hare@suse.de>
Implement an RB-tree holding the zone information (struct blk_zone)
of a zoned block device and add support functions for maintaining the
RB-tree and manipulating zone structs. The block layer support does
not differentiate between host-aware and host-managed devices. The
different constraints of these zone models are handled further down
the stack by the sd driver of the generic SCSI layer.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Changelog (Damien):
* Changed struct blk_zone to be more compact (64B)
* Changed zone locking to use bit_spin_lock in place of a regular
spinlock
* Issue zone operations to the underlying block device driver
as BIOs with the operation codes REQ_OP_ZONE_*.
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
---
block/Kconfig | 8 ++
block/Makefile | 1 +
block/blk-core.c | 4 +
block/blk-zoned.c | 338 +++++++++++++++++++++++++++++++++++++++++++++++++
include/linux/blkdev.h | 113 +++++++++++++++++
5 files changed, 464 insertions(+)
create mode 100644 block/blk-zoned.c
diff --git a/block/Kconfig b/block/Kconfig
index 161491d..c3a18f0 100644
--- a/block/Kconfig
+++ b/block/Kconfig
@@ -88,6 +88,14 @@ config BLK_DEV_INTEGRITY
T10/SCSI Data Integrity Field or the T13/ATA External Path
Protection. If in doubt, say N.
+config BLK_DEV_ZONED
+ bool "Zoned block device support"
+ ---help---
+ Block layer zoned block device support. This option enables
+ support for ZAC/ZBC host-managed and host-aware zoned block devices.
+
+ Say yes here if you have a ZAC or ZBC storage device.
+
config BLK_DEV_THROTTLING
bool "Block layer bio throttling support"
depends on BLK_CGROUP=y
diff --git a/block/Makefile b/block/Makefile
index 9eda232..aee67fa 100644
--- a/block/Makefile
+++ b/block/Makefile
@@ -22,4 +22,5 @@ obj-$(CONFIG_IOSCHED_CFQ) += cfq-iosched.o
obj-$(CONFIG_BLOCK_COMPAT) += compat_ioctl.o
obj-$(CONFIG_BLK_CMDLINE_PARSER) += cmdline-parser.o
obj-$(CONFIG_BLK_DEV_INTEGRITY) += bio-integrity.o blk-integrity.o t10-pi.o
+obj-$(CONFIG_BLK_DEV_ZONED) += blk-zoned.o
diff --git a/block/blk-core.c b/block/blk-core.c
index 4a7f7ba..2c5d069d 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -590,6 +590,8 @@ void blk_cleanup_queue(struct request_queue *q)
blk_mq_free_queue(q);
percpu_ref_exit(&q->q_usage_counter);
+ blk_drop_zones(q);
+
spin_lock_irq(lock);
if (q->queue_lock != &q->__queue_lock)
q->queue_lock = &q->__queue_lock;
@@ -728,6 +730,8 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id)
#endif
INIT_DELAYED_WORK(&q->delay_work, blk_delay_work);
+ blk_init_zones(q);
+
kobject_init(&q->kobj, &blk_queue_ktype);
mutex_init(&q->sysfs_lock);
diff --git a/block/blk-zoned.c b/block/blk-zoned.c
new file mode 100644
index 0000000..a107940
--- /dev/null
+++ b/block/blk-zoned.c
@@ -0,0 +1,338 @@
+/*
+ * Zoned block device handling
+ *
+ * Copyright (c) 2015, Hannes Reinecke
+ * Copyright (c) 2015, SUSE Linux GmbH
+ *
+ * Copyright (c) 2016, Damien Le Moal
+ * Copyright (c) 2016, Western Digital
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/rbtree.h>
+#include <linux/blkdev.h>
+
+void blk_init_zones(struct request_queue *q)
+{
+ spin_lock_init(&q->zones_lock);
+ q->zones = RB_ROOT;
+}
+
+/**
+ * blk_drop_zones - Empty a zoned device zone tree.
+ * @q: queue of the zoned device to operate on
+ *
+ * Free all zone descriptors added to the queue zone tree.
+ */
+void blk_drop_zones(struct request_queue *q)
+{
+ struct rb_root *root = &q->zones;
+ struct blk_zone *zone, *next;
+
+ rbtree_postorder_for_each_entry_safe(zone, next, root, node)
+ kfree(zone);
+ q->zones = RB_ROOT;
+}
+EXPORT_SYMBOL_GPL(blk_drop_zones);
+
+/**
+ * blk_insert_zone - Add a new zone struct to the queue RB-tree.
+ * @q: queue of the zoned device to operate on
+ * @new_zone: The zone struct to add
+ *
+ * If @new_zone is not already added to the zone tree, add it.
+ * Otherwise, return the existing entry.
+ */
+struct blk_zone *blk_insert_zone(struct request_queue *q,
+ struct blk_zone *new_zone)
+{
+ struct rb_root *root = &q->zones;
+ struct rb_node **new = &(root->rb_node), *parent = NULL;
+ struct blk_zone *zone = NULL;
+ unsigned long flags;
+
+ spin_lock_irqsave(&q->zones_lock, flags);
+
+ /* Figure out where to put new node */
+ while (*new) {
+ zone = container_of(*new, struct blk_zone, node);
+ parent = *new;
+ if (new_zone->start + new_zone->len <= zone->start)
+ new = &((*new)->rb_left);
+ else if (new_zone->start >= zone->start + zone->len)
+ new = &((*new)->rb_right);
+ else
+ /* Return existing zone */
+ break;
+ zone = NULL;
+ }
+
+ if (!zone) {
+ /* No existing zone: add new node and rebalance tree */
+ rb_link_node(&new_zone->node, parent, new);
+ rb_insert_color(&new_zone->node, root);
+ }
+
+ spin_unlock_irqrestore(&q->zones_lock, flags);
+
+ return zone;
+}
+EXPORT_SYMBOL_GPL(blk_insert_zone);
+
+/**
+ * blk_lookup_zone - Search for a zone in a zoned device zone tree.
+ * @q: queue of the zoned device tree to search
+ * @sector: A sector within the zone to search for
+ *
+ * Search for the zone containing @sector in the zone tree owned
+ * by @q. NULL is returned if the zone is not found. Since this
+ * can be called concurrently with blk_insert_zone during device
+ * initialization, the tree traversal is protected using the
+ * zones_lock of the queue.
+ */
+struct blk_zone *blk_lookup_zone(struct request_queue *q, sector_t sector)
+{
+ struct rb_root *root = &q->zones;
+ struct rb_node *node = root->rb_node;
+ struct blk_zone *zone = NULL;
+ unsigned long flags;
+
+ spin_lock_irqsave(&q->zones_lock, flags);
+
+ while (node) {
+ zone = container_of(node, struct blk_zone, node);
+ if (sector < zone->start)
+ node = node->rb_left;
+ else if (sector >= zone->start + zone->len)
+ node = node->rb_right;
+ else
+ break;
+ zone = NULL;
+ }
+
+ spin_unlock_irqrestore(&q->zones_lock, flags);
+
+ return zone;
+}
+EXPORT_SYMBOL_GPL(blk_lookup_zone);
+
+/**
+ * Execute a zone operation (REQ_OP_ZONE*)
+ */
+static int blkdev_issue_zone_operation(struct block_device *bdev,
+ unsigned int op,
+ sector_t sector, sector_t nr_sects,
+ gfp_t gfp_mask)
+{
+ struct bio *bio;
+ int ret;
+
+ if (!bdev_zoned(bdev))
+ return -EOPNOTSUPP;
+
+ /*
+ * Make sure bi_size does not overflow because
+ * of some weird very large zone size.
+ */
+ if (nr_sects && (unsigned long long)nr_sects << 9 > UINT_MAX)
+ return -EINVAL;
+
+ bio = bio_alloc(gfp_mask, 1);
+ if (!bio)
+ return -ENOMEM;
+
+ bio->bi_iter.bi_sector = sector;
+ bio->bi_iter.bi_size = nr_sects << 9;
+ bio->bi_vcnt = 0;
+ bio->bi_bdev = bdev;
+ bio_set_op_attrs(bio, op, 0);
+
+ ret = submit_bio_wait(bio);
+
+ bio_put(bio);
+
+ return ret;
+}
+
+/**
+ * blkdev_update_zones - Force an update of the zone information of a device
+ * @bdev: Target block device
+ *
+ * Force an update of the information of all zones of @bdev. This call does not
+ * block waiting for the update to complete. On return, all zones are only
+ * marked as "in-update". Waiting on the zone update to complete can be done
+ * on a per zone basis using the function blk_wait_for_zone_update.
+ */
+int blkdev_update_zones(struct block_device *bdev,
+ gfp_t gfp_mask)
+{
+ return blkdev_issue_zone_operation(bdev, REQ_OP_ZONE_REPORT,
+ 0, 0, gfp_mask);
+}
+
+/*
+ * Wait for a zone update to complete.
+ */
+static void __blk_wait_for_zone_update(struct blk_zone *zone)
+{
+ might_sleep();
+ if (test_bit(BLK_ZONE_IN_UPDATE, &zone->flags))
+ wait_on_bit_io(&zone->flags, BLK_ZONE_IN_UPDATE,
+ TASK_UNINTERRUPTIBLE);
+}
+
+/**
+ * blk_wait_for_zone_update - Wait for a zone information update
+ * @zone: The zone to wait for
+ *
+ * This must be called with the zone lock held. If @zone is not
+ * under update, returns immediately. Otherwise, wait for the
+ * update flag to be cleared on completion of the zone information
+ * update by the device driver.
+ */
+void blk_wait_for_zone_update(struct blk_zone *zone)
+{
+ WARN_ON_ONCE(!test_bit(BLK_ZONE_LOCKED, &zone->flags));
+ while (test_bit(BLK_ZONE_IN_UPDATE, &zone->flags)) {
+ blk_unlock_zone(zone);
+ __blk_wait_for_zone_update(zone);
+ blk_lock_zone(zone);
+ }
+}
+
+/**
+ * blkdev_report_zone - Get the information of a zone
+ * @bdev: Target block device
+ * @sector: A sector of the zone to report
+ * @update: Force an update of the zone information
+ * @gfp_mask: Memory allocation flags (for bio_alloc)
+ *
+ * Get a zone from the zone cache and return it.
+ * If update is requested, issue a report zone operation
+ * and wait for the zone information to be updated.
+ */
+struct blk_zone *blkdev_report_zone(struct block_device *bdev,
+ sector_t sector,
+ bool update,
+ gfp_t gfp_mask)
+{
+ struct request_queue *q = bdev_get_queue(bdev);
+ struct blk_zone *zone;
+ int ret;
+
+ zone = blk_lookup_zone(q, sector);
+ if (!zone)
+ return ERR_PTR(-ENXIO);
+
+ if (update) {
+ ret = blkdev_issue_zone_operation(bdev, REQ_OP_ZONE_REPORT,
+ zone->start, zone->len,
+ gfp_mask);
+ if (ret)
+ return ERR_PTR(ret);
+ __blk_wait_for_zone_update(zone);
+ }
+
+ return zone;
+}
+
+/**
+ * Execute a zone action (open, close, reset or finish).
+ */
+static int blkdev_issue_zone_action(struct block_device *bdev,
+ sector_t sector, unsigned int op,
+ gfp_t gfp_mask)
+{
+ struct request_queue *q = bdev_get_queue(bdev);
+ struct blk_zone *zone;
+ sector_t nr_sects;
+ int ret;
+
+ if (!blk_queue_zoned(q))
+ return -EOPNOTSUPP;
+
+ if (sector == ~0ULL) {
+ /* All zones */
+ sector = 0;
+ nr_sects = 0;
+ } else {
+ /* This zone */
+ zone = blk_lookup_zone(q, sector);
+ if (!zone)
+ return -ENXIO;
+ sector = zone->start;
+ nr_sects = zone->len;
+ }
+
+ ret = blkdev_issue_zone_operation(bdev, op, sector,
+ nr_sects, gfp_mask);
+ if (ret == 0 && !nr_sects)
+ blkdev_update_zones(bdev, gfp_mask);
+
+ return ret;
+}
+
+/**
+ * blkdev_reset_zone - Reset a zone write pointer
+ * @bdev: target block device
+ * @sector: A sector of the zone to reset or ~0ul for all zones.
+ * @gfp_mask: memory allocation flags (for bio_alloc)
+ *
+ * Description:
+ * Reset a zone or all zones write pointer.
+ */
+int blkdev_reset_zone(struct block_device *bdev,
+ sector_t sector, gfp_t gfp_mask)
+{
+ return blkdev_issue_zone_action(bdev, sector, REQ_OP_ZONE_RESET,
+ gfp_mask);
+}
+
+/**
+ * blkdev_open_zone - Explicitly open a zone
+ * @bdev: target block device
+ * @sector: A sector of the zone to open or ~0ul for all zones.
+ * @gfp_mask: memory allocation flags (for bio_alloc)
+ *
+ * Description:
+ * Open a zone or all possible zones.
+ */
+int blkdev_open_zone(struct block_device *bdev,
+ sector_t sector, gfp_t gfp_mask)
+{
+ return blkdev_issue_zone_action(bdev, sector, REQ_OP_ZONE_OPEN,
+ gfp_mask);
+}
+
+/**
+ * blkdev_close_zone - Close an open zone
+ * @bdev: target block device
+ * @sector: A sector of the zone to close or ~0ul for all zones.
+ * @gfp_mask: memory allocation flags (for bio_alloc)
+ *
+ * Description:
+ * Close a zone or all open zones.
+ */
+int blkdev_close_zone(struct block_device *bdev,
+ sector_t sector, gfp_t gfp_mask)
+{
+ return blkdev_issue_zone_action(bdev, sector, REQ_OP_ZONE_CLOSE,
+ gfp_mask);
+}
+
+/**
+ * blkdev_finish_zone - Finish a zone (make it full)
+ * @bdev: target block device
+ * @sector: A sector of the zone to finish or ~0ul for all zones.
+ * @gfp_mask: memory allocation flags (for bio_alloc)
+ *
+ * Description:
+ * Finish one zone or all possible zones.
+ */
+int blkdev_finish_zone(struct block_device *bdev,
+ sector_t sector, gfp_t gfp_mask)
+{
+ return blkdev_issue_zone_action(bdev, sector, REQ_OP_ZONE_FINISH,
+ gfp_mask);
+}
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 1c74b19..1165594 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -24,6 +24,7 @@
#include <linux/rcupdate.h>
#include <linux/percpu-refcount.h>
#include <linux/scatterlist.h>
+#include <linux/bit_spinlock.h>
struct module;
struct scsi_ioctl_command;
@@ -302,6 +303,113 @@ struct queue_limits {
unsigned char zoned;
};
+#ifdef CONFIG_BLK_DEV_ZONED
+
+enum blk_zone_type {
+ BLK_ZONE_TYPE_UNKNOWN,
+ BLK_ZONE_TYPE_CONVENTIONAL,
+ BLK_ZONE_TYPE_SEQWRITE_REQ,
+ BLK_ZONE_TYPE_SEQWRITE_PREF,
+};
+
+enum blk_zone_cond {
+ BLK_ZONE_COND_NO_WP,
+ BLK_ZONE_COND_EMPTY,
+ BLK_ZONE_COND_IMP_OPEN,
+ BLK_ZONE_COND_EXP_OPEN,
+ BLK_ZONE_COND_CLOSED,
+ BLK_ZONE_COND_READONLY = 0xd,
+ BLK_ZONE_COND_FULL,
+ BLK_ZONE_COND_OFFLINE,
+};
+
+enum blk_zone_flags {
+ BLK_ZONE_LOCKED,
+ BLK_ZONE_WRITE_LOCKED,
+ BLK_ZONE_IN_UPDATE,
+};
+
+/**
+ * Zone descriptor. On 64-bit architectures,
+ * this will align on sizeof(long), i.e. 8 B,
+ * and use 64 B.
+ */
+struct blk_zone {
+ struct rb_node node;
+ unsigned long flags;
+ sector_t len;
+ sector_t start;
+ sector_t wp;
+ unsigned int type : 4;
+ unsigned int cond : 4;
+ unsigned int non_seq : 1;
+ unsigned int reset : 1;
+};
+
+#define blk_zone_is_seq_req(z) ((z)->type == BLK_ZONE_TYPE_SEQWRITE_REQ)
+#define blk_zone_is_seq_pref(z) ((z)->type == BLK_ZONE_TYPE_SEQWRITE_PREF)
+#define blk_zone_is_seq(z) (blk_zone_is_seq_req(z) || blk_zone_is_seq_pref(z))
+#define blk_zone_is_conv(z) ((z)->type == BLK_ZONE_TYPE_CONVENTIONAL)
+
+#define blk_zone_is_readonly(z) ((z)->cond == BLK_ZONE_COND_READONLY)
+#define blk_zone_is_offline(z) ((z)->cond == BLK_ZONE_COND_OFFLINE)
+#define blk_zone_is_full(z) ((z)->cond == BLK_ZONE_COND_FULL)
+#define blk_zone_is_empty(z) ((z)->cond == BLK_ZONE_COND_EMPTY)
+#define blk_zone_is_open(z) ((z)->cond == BLK_ZONE_COND_EXP_OPEN)
+
+static inline void blk_lock_zone(struct blk_zone *zone)
+{
+ bit_spin_lock(BLK_ZONE_LOCKED, &zone->flags);
+}
+
+static inline int blk_trylock_zone(struct blk_zone *zone)
+{
+ return bit_spin_trylock(BLK_ZONE_LOCKED, &zone->flags);
+}
+
+static inline void blk_unlock_zone(struct blk_zone *zone)
+{
+ bit_spin_unlock(BLK_ZONE_LOCKED, &zone->flags);
+}
+
+static inline int blk_try_write_lock_zone(struct blk_zone *zone)
+{
+ return !test_and_set_bit(BLK_ZONE_WRITE_LOCKED, &zone->flags);
+}
+
+static inline void blk_write_unlock_zone(struct blk_zone *zone)
+{
+ clear_bit_unlock(BLK_ZONE_WRITE_LOCKED, &zone->flags);
+ smp_mb__after_atomic();
+}
+
+extern void blk_init_zones(struct request_queue *);
+extern void blk_drop_zones(struct request_queue *);
+extern struct blk_zone *blk_insert_zone(struct request_queue *,
+ struct blk_zone *);
+extern struct blk_zone *blk_lookup_zone(struct request_queue *, sector_t);
+
+extern int blkdev_update_zones(struct block_device *, gfp_t);
+extern void blk_wait_for_zone_update(struct blk_zone *);
+#define blk_zone_in_update(z) test_bit(BLK_ZONE_IN_UPDATE, &(z)->flags)
+static inline void blk_clear_zone_update(struct blk_zone *zone)
+{
+ clear_bit_unlock(BLK_ZONE_IN_UPDATE, &zone->flags);
+ smp_mb__after_atomic();
+ wake_up_bit(&zone->flags, BLK_ZONE_IN_UPDATE);
+}
+
+extern struct blk_zone *blkdev_report_zone(struct block_device *,
+ sector_t, bool, gfp_t);
+extern int blkdev_reset_zone(struct block_device *, sector_t, gfp_t);
+extern int blkdev_open_zone(struct block_device *, sector_t, gfp_t);
+extern int blkdev_close_zone(struct block_device *, sector_t, gfp_t);
+extern int blkdev_finish_zone(struct block_device *, sector_t, gfp_t);
+#else /* CONFIG_BLK_DEV_ZONED */
+static inline void blk_init_zones(struct request_queue *q) { };
+static inline void blk_drop_zones(struct request_queue *q) { };
+#endif /* CONFIG_BLK_DEV_ZONED */
+
struct request_queue {
/*
* Together with queue_head for cacheline sharing
@@ -404,6 +512,11 @@ struct request_queue {
unsigned int nr_pending;
#endif
+#ifdef CONFIG_BLK_DEV_ZONED
+ spinlock_t zones_lock;
+ struct rb_root zones;
+#endif
+
/*
* queue settings
*/
--
2.7.4
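Note on the zone write lock added to include/linux/blkdev.h above: the
behavior of blk_try_write_lock_zone() and blk_write_unlock_zone() can be
illustrated with a small userspace model. This is only a sketch built on
C11 atomics, not the kernel implementation; struct zone_model,
zone_try_write_lock() and zone_write_unlock() are hypothetical names
introduced for illustration. It shows why at most one writer per zone can
hold the lock at a time.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Bit numbers follow enum blk_zone_flags from the patch. */
enum zone_flag_bit {
	ZONE_LOCKED,
	ZONE_WRITE_LOCKED,
	ZONE_IN_UPDATE,
};

/* Hypothetical userspace stand-in for the flags word of struct blk_zone. */
struct zone_model {
	atomic_ulong flags;
};

/*
 * Models blk_try_write_lock_zone(): atomically set the bit and report
 * success only if it was clear before (test-and-set semantics).
 */
static bool zone_try_write_lock(struct zone_model *z)
{
	unsigned long mask = 1UL << ZONE_WRITE_LOCKED;
	unsigned long old = atomic_fetch_or(&z->flags, mask);

	return !(old & mask);
}

/* Models blk_write_unlock_zone(): clear the bit with release ordering. */
static void zone_write_unlock(struct zone_model *z)
{
	unsigned long mask = 1UL << ZONE_WRITE_LOCKED;
	unsigned long old = atomic_fetch_and_explicit(&z->flags, ~mask,
						      memory_order_release);

	/* Unlocking a zone that is not write-locked is a caller bug. */
	assert(old & mask);
	(void)old;
}
```

A second zone_try_write_lock() on the same zone fails until the holder
calls zone_write_unlock(), which is what limits each zone to one write in
flight.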
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH 6/9] block: Add 'BLKPREP_DONE' return value
2016-09-19 21:27 ` Damien Le Moal
@ 2016-09-19 21:27 ` Damien Le Moal
-1 siblings, 0 replies; 36+ messages in thread
From: Damien Le Moal @ 2016-09-19 21:27 UTC (permalink / raw)
To: linux-scsi, linux-block
Cc: martin.petersen, axboe, hare, shaun.tancheff, Damien Le Moal
From: Hannes Reinecke <hare@suse.de>
Add a new blkprep return code BLKPREP_DONE to signal completion
without I/O error.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Changelog (Damien):
Rewrite adding blk_prep_end_request as suggested by Christoph Hellwig
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
---
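For readers following the control flow, here is a rough userspace model of
how the peek loop treats each prep return code after this change. It is a
sketch under simplifying assumptions (an array stands in for the request
queue, and the PREP_* names, struct model_rq, model_end_request() and
model_peek() are hypothetical), not the code in blk-core.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifndef EREMOTEIO
#define EREMOTEIO 121	/* Linux value, for platforms that lack it */
#endif

/* Hypothetical stand-ins for the BLKPREP_* codes. */
enum prep_ret {
	PREP_OK,
	PREP_KILL,
	PREP_DEFER,
	PREP_INVALID,
	PREP_DONE,
};

struct model_rq {
	enum prep_ret prep;	/* what the prep function would return */
	int completed;		/* set once the request has been ended */
	int error;		/* completion status, valid if completed */
};

/* Stand-in for blk_prep_end_request(): end the request with a status. */
static void model_end_request(struct model_rq *rq, int error)
{
	assert(!rq->completed);
	rq->completed = 1;
	rq->error = error;
}

/*
 * Walk the queue from the head. Return the first request that is ready
 * for dispatch, or NULL if the head must wait or the queue is drained.
 */
static struct model_rq *model_peek(struct model_rq *queue, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		struct model_rq *rq = &queue[i];

		if (rq->completed)
			continue;

		switch (rq->prep) {
		case PREP_OK:
			return rq;
		case PREP_DEFER:
			return NULL;
		case PREP_KILL:
			model_end_request(rq, -EIO);
			break;
		case PREP_INVALID:
			model_end_request(rq, -EREMOTEIO);
			break;
		case PREP_DONE:
			/* New case: complete the request without error. */
			model_end_request(rq, 0);
			break;
		}
	}
	return NULL;
}
```

In this model a PREP_DONE request is ended with status 0 and the loop
moves on to the next request, the same way killed requests are skipped.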
block/blk-core.c | 42 ++++++++++++++++++++++++++----------------
drivers/scsi/scsi_lib.c | 1 +
include/linux/blkdev.h | 1 +
3 files changed, 28 insertions(+), 16 deletions(-)
diff --git a/block/blk-core.c b/block/blk-core.c
index 2c5d069d..8dbbb1a 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2341,6 +2341,17 @@ void blk_account_io_start(struct request *rq, bool new_io)
part_stat_unlock();
}
+static void blk_prep_end_request(struct request *rq, int error)
+{
+ /*
+ * Mark this request as started so we don't trigger
+ * any debug logic in the end I/O path.
+ */
+ rq->cmd_flags |= REQ_QUIET;
+ blk_start_request(rq);
+ __blk_end_request_all(rq, error);
+}
+
/**
* blk_peek_request - peek at the top of a request queue
* @q: request queue to peek at
@@ -2408,9 +2419,10 @@ struct request *blk_peek_request(struct request_queue *q)
break;
ret = q->prep_rq_fn(q, rq);
- if (ret == BLKPREP_OK) {
- break;
- } else if (ret == BLKPREP_DEFER) {
+ switch (ret) {
+ case BLKPREP_OK:
+ goto out;
+ case BLKPREP_DEFER:
/*
* the request may have been (partially) prepped.
* we need to keep this request in the front to
@@ -2425,25 +2437,23 @@ struct request *blk_peek_request(struct request_queue *q)
*/
--rq->nr_phys_segments;
}
-
rq = NULL;
+ goto out;
+ case BLKPREP_KILL:
+ blk_prep_end_request(rq, -EIO);
break;
- } else if (ret == BLKPREP_KILL || ret == BLKPREP_INVALID) {
- int err = (ret == BLKPREP_INVALID) ? -EREMOTEIO : -EIO;
-
- rq->cmd_flags |= REQ_QUIET;
- /*
- * Mark this request as started so we don't trigger
- * any debug logic in the end I/O path.
- */
- blk_start_request(rq);
- __blk_end_request_all(rq, err);
- } else {
+ case BLKPREP_INVALID:
+ blk_prep_end_request(rq, -EREMOTEIO);
+ break;
+ case BLKPREP_DONE:
+ blk_prep_end_request(rq, 0);
+ break;
+ default:
printk(KERN_ERR "%s: bad return=%d\n", __func__, ret);
break;
}
}
-
+out:
return rq;
}
EXPORT_SYMBOL(blk_peek_request);
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index c71344a..f99504d 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1260,6 +1260,7 @@ scsi_prep_return(struct request_queue *q, struct request *req, int ret)
case BLKPREP_KILL:
case BLKPREP_INVALID:
req->errors = DID_NO_CONNECT << 16;
+ case BLKPREP_DONE:
/* release the command and kill it */
if (req->special) {
struct scsi_cmnd *cmd = req->special;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 1165594..a85f95b 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -819,6 +819,7 @@ enum {
BLKPREP_KILL, /* fatal error, kill, return -EIO */
BLKPREP_DEFER, /* leave on queue */
BLKPREP_INVALID, /* invalid command, kill, return -EREMOTEIO */
+ BLKPREP_DONE, /* complete w/o error */
};
extern unsigned long blk_max_low_pfn, blk_max_pfn;
--
2.7.4
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH 7/9] block: Add 'BLK_MQ_RQ_QUEUE_DONE' return value
2016-09-19 21:27 ` Damien Le Moal
@ 2016-09-19 21:27 ` Damien Le Moal
-1 siblings, 0 replies; 36+ messages in thread
From: Damien Le Moal @ 2016-09-19 21:27 UTC (permalink / raw)
To: linux-scsi, linux-block
Cc: martin.petersen, axboe, hare, shaun.tancheff, Hannes Reinecke,
Damien Le Moal
From: Hannes Reinecke <hare@suse.de>
Add a return value BLK_MQ_RQ_QUEUE_DONE to terminate a request
without error.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
---
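The mapping this patch sets up can be summarized with a small userspace
model: a prep result of "done" becomes a queue result of "done", and the
request is then ended with whatever status it already carries rather than
-EIO. This is a sketch only; the PREP_* and MQ_* names and both helper
functions are hypothetical stand-ins, and the numeric queue values are the
ones shown in the blk-mq.h hunk.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the BLKPREP_* codes. */
enum prep_ret { PREP_OK, PREP_KILL, PREP_DEFER, PREP_INVALID, PREP_DONE };

/* Values as listed in the include/linux/blk-mq.h hunk of this patch. */
enum mq_ret {
	MQ_RQ_QUEUE_OK = 0,	/* queued fine */
	MQ_RQ_QUEUE_BUSY = 1,	/* requeue IO for later */
	MQ_RQ_QUEUE_ERROR = 2,	/* end IO with error */
	MQ_RQ_QUEUE_DONE = 3,	/* end IO w/o error */
};

/* Models prep_to_mq() in scsi_lib.c after the patch. */
static enum mq_ret model_prep_to_mq(enum prep_ret ret)
{
	switch (ret) {
	case PREP_OK:
		return MQ_RQ_QUEUE_OK;
	case PREP_DEFER:
		return MQ_RQ_QUEUE_BUSY;
	case PREP_DONE:
		return MQ_RQ_QUEUE_DONE;
	default:
		return MQ_RQ_QUEUE_ERROR;
	}
}

/*
 * Models the end-of-request handling in the hardware queue run loop.
 * Sets *ended to 1 and returns the completion status if the request is
 * ended, otherwise sets *ended to 0.
 */
static int model_end_status(enum mq_ret ret, int rq_errors, int *ended)
{
	assert(ended);
	*ended = 1;

	switch (ret) {
	case MQ_RQ_QUEUE_OK:
	case MQ_RQ_QUEUE_BUSY:
		*ended = 0;
		return 0;
	case MQ_RQ_QUEUE_ERROR:
		rq_errors = -EIO;
		/* fall through */
	case MQ_RQ_QUEUE_DONE:
		return rq_errors;
	}
	return rq_errors;
}
```

The fall-through from the error case into the done case mirrors the
structure of the blk-mq.c hunk: both end the request, and only the error
case overrides the status first.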
block/blk-mq.c | 1 +
drivers/scsi/scsi_lib.c | 3 +++
include/linux/blk-mq.h | 1 +
3 files changed, 5 insertions(+)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 13f5a6c..6300629 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -851,6 +851,7 @@ static void __blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx)
pr_err("blk-mq: bad return on queue: %d\n", ret);
case BLK_MQ_RQ_QUEUE_ERROR:
rq->errors = -EIO;
+ case BLK_MQ_RQ_QUEUE_DONE:
blk_mq_end_request(rq, rq->errors);
break;
}
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index f99504d..793b791 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1805,6 +1805,8 @@ static inline int prep_to_mq(int ret)
return 0;
case BLKPREP_DEFER:
return BLK_MQ_RQ_QUEUE_BUSY;
+ case BLKPREP_DONE:
+ return BLK_MQ_RQ_QUEUE_DONE;
default:
return BLK_MQ_RQ_QUEUE_ERROR;
}
@@ -1948,6 +1950,7 @@ out:
blk_mq_delay_queue(hctx, SCSI_QUEUE_DELAY);
break;
case BLK_MQ_RQ_QUEUE_ERROR:
+ case BLK_MQ_RQ_QUEUE_DONE:
/*
* Make sure to release all allocated ressources when
* we hit an error, as we will never see this command
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index e43bbff..07b4888 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -153,6 +153,7 @@ enum {
BLK_MQ_RQ_QUEUE_OK = 0, /* queued fine */
BLK_MQ_RQ_QUEUE_BUSY = 1, /* requeue IO for later */
BLK_MQ_RQ_QUEUE_ERROR = 2, /* end IO with error */
+ BLK_MQ_RQ_QUEUE_DONE = 3, /* end IO w/o error */
BLK_MQ_F_SHOULD_MERGE = 1 << 0,
BLK_MQ_F_TAG_SHARED = 1 << 1,
--
2.7.4
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH 8/9] sd: Implement support for ZBC devices
2016-09-19 21:27 ` Damien Le Moal
@ 2016-09-19 21:27 ` Damien Le Moal
-1 siblings, 0 replies; 36+ messages in thread
From: Damien Le Moal @ 2016-09-19 21:27 UTC (permalink / raw)
To: linux-scsi, linux-block
Cc: martin.petersen, axboe, hare, shaun.tancheff, Hannes Reinecke,
Damien Le Moal
From: Hannes Reinecke <hare@suse.com>
Implement ZBC support functions to set up zoned disks and fill the
block device zone information tree during the device scan. The
zone information tree is also always updated on disk revalidation.
This adds support for the REQ_OP_ZONE* operations and also implements
the new RESET_WP provisioning mode so that discard requests can be
mapped to the RESET WRITE POINTER command for devices with a constant
zone size.
The capacity read of the device triggers the zone information read
for zoned block devices. As this needs the device zone model, the
call to sd_read_capacity is moved after the call to
sd_read_block_characteristics so that host-aware devices are
properly initialized. The call to sd_zbc_read_zones in
sd_read_capacity may change the device capacity obtained with
the read_capacity_16 function for devices reporting only the
capacity of conventional zones at the beginning of the LBA range
(i.e. devices with rc_basis set to 0).
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
---
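As a reading aid, the zone model detection and the rc_basis extraction
done in sd.c below can be sketched in userspace as follows. The bit
positions are taken from the hunks in this patch; the ZONED_* enum, the
MODEL_TYPE_* constants and both function names are hypothetical, and
0x14 is assumed here to be the peripheral device type of host-managed
devices (TYPE_ZBC).

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_TYPE_DISK	0x00
#define MODEL_TYPE_ZBC	0x14	/* assumed value of TYPE_ZBC */

/* Hypothetical stand-ins for BLK_ZONED_NONE, BLK_ZONED_HA, BLK_ZONED_HM. */
enum zoned_model {
	ZONED_NONE,
	ZONED_HA,	/* host-aware */
	ZONED_HM,	/* host-managed */
};

/*
 * Decode the zone model the way sd_read_block_characteristics() does:
 * the ZONED field sits in bits 5:4 of byte 8 of the Block Device
 * Characteristics VPD page, and host-managed devices are recognized by
 * their device type.
 */
static enum zoned_model model_zoned(const uint8_t *vpd, uint8_t dev_type)
{
	unsigned int zoned;

	assert(vpd);
	zoned = (vpd[8] >> 4) & 3;

	if (zoned == 1)
		return ZONED_HA;
	if (dev_type == MODEL_TYPE_ZBC)
		return ZONED_HM;
	return ZONED_NONE;
}

/*
 * Extract RC BASIS from READ CAPACITY(16) data, as read_capacity_16()
 * does after this patch: bits 5:4 of byte 12.
 */
static unsigned int model_rc_basis(const uint8_t *rc16)
{
	assert(rc16);
	return (rc16[12] >> 4) & 0x3;
}
```

With rc_basis equal to 0 the reported capacity covers only the
conventional zones at the start of the LBA range, which is the case where
the zone report may change the capacity.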
drivers/scsi/Makefile | 1 +
drivers/scsi/sd.c | 147 ++++--
drivers/scsi/sd.h | 68 +++
drivers/scsi/sd_zbc.c | 1097 +++++++++++++++++++++++++++++++++++++++++++++
include/scsi/scsi_proto.h | 17 +
5 files changed, 1304 insertions(+), 26 deletions(-)
create mode 100644 drivers/scsi/sd_zbc.c
diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile
index d539798..fabcb6d 100644
--- a/drivers/scsi/Makefile
+++ b/drivers/scsi/Makefile
@@ -179,6 +179,7 @@ hv_storvsc-y := storvsc_drv.o
sd_mod-objs := sd.o
sd_mod-$(CONFIG_BLK_DEV_INTEGRITY) += sd_dif.o
+sd_mod-$(CONFIG_BLK_DEV_ZONED) += sd_zbc.o
sr_mod-objs := sr.o sr_ioctl.o sr_vendor.o
ncr53c8xx-flags-$(CONFIG_SCSI_ZALON) \
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index d3e852a..46b8b78 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -92,6 +92,7 @@ MODULE_ALIAS_BLOCKDEV_MAJOR(SCSI_DISK15_MAJOR);
MODULE_ALIAS_SCSI_DEVICE(TYPE_DISK);
MODULE_ALIAS_SCSI_DEVICE(TYPE_MOD);
MODULE_ALIAS_SCSI_DEVICE(TYPE_RBC);
+MODULE_ALIAS_SCSI_DEVICE(TYPE_ZBC);
#if !defined(CONFIG_DEBUG_BLOCK_EXT_DEVT)
#define SD_MINORS 16
@@ -99,7 +100,6 @@ MODULE_ALIAS_SCSI_DEVICE(TYPE_RBC);
#define SD_MINORS 0
#endif
-static void sd_config_discard(struct scsi_disk *, unsigned int);
static void sd_config_write_same(struct scsi_disk *);
static int sd_revalidate_disk(struct gendisk *);
static void sd_unlock_native_capacity(struct gendisk *disk);
@@ -162,7 +162,7 @@ cache_type_store(struct device *dev, struct device_attribute *attr,
static const char temp[] = "temporary ";
int len;
- if (sdp->type != TYPE_DISK)
+ if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
/* no cache control on RBC devices; theoretically they
* can do it, but there's probably so many exceptions
* it's not worth the risk */
@@ -261,7 +261,7 @@ allow_restart_store(struct device *dev, struct device_attribute *attr,
if (!capable(CAP_SYS_ADMIN))
return -EACCES;
- if (sdp->type != TYPE_DISK)
+ if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
return -EINVAL;
sdp->allow_restart = simple_strtoul(buf, NULL, 10);
@@ -369,6 +369,7 @@ static const char *lbp_mode[] = {
[SD_LBP_WS16] = "writesame_16",
[SD_LBP_WS10] = "writesame_10",
[SD_LBP_ZERO] = "writesame_zero",
+ [SD_ZBC_RESET_WP] = "reset_wp",
[SD_LBP_DISABLE] = "disabled",
};
@@ -391,6 +392,13 @@ provisioning_mode_store(struct device *dev, struct device_attribute *attr,
if (!capable(CAP_SYS_ADMIN))
return -EACCES;
+ if (sdkp->zoned == 1 || sdp->type == TYPE_ZBC) {
+ if (!strncmp(buf, lbp_mode[SD_ZBC_RESET_WP], 20)) {
+ sd_config_discard(sdkp, SD_ZBC_RESET_WP);
+ return count;
+ }
+ return -EINVAL;
+ }
if (sdp->type != TYPE_DISK)
return -EINVAL;
@@ -458,7 +466,7 @@ max_write_same_blocks_store(struct device *dev, struct device_attribute *attr,
if (!capable(CAP_SYS_ADMIN))
return -EACCES;
- if (sdp->type != TYPE_DISK)
+ if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
return -EINVAL;
err = kstrtoul(buf, 10, &max);
@@ -631,7 +639,7 @@ static unsigned char sd_setup_protect_cmnd(struct scsi_cmnd *scmd,
return protect;
}
-static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode)
+void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode)
{
struct request_queue *q = sdkp->disk->queue;
unsigned int logical_block_size = sdkp->device->sector_size;
@@ -683,6 +691,11 @@ static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode)
q->limits.discard_zeroes_data = sdkp->lbprz;
break;
+ case SD_ZBC_RESET_WP:
+ max_blocks = min_not_zero(sdkp->max_unmap_blocks,
+ (u32)SD_MAX_WS16_BLOCKS);
+ break;
+
case SD_LBP_ZERO:
max_blocks = min_not_zero(sdkp->max_ws_blocks,
(u32)SD_MAX_WS10_BLOCKS);
@@ -711,16 +724,20 @@ static int sd_setup_discard_cmnd(struct scsi_cmnd *cmd)
unsigned int nr_sectors = blk_rq_sectors(rq);
unsigned int nr_bytes = blk_rq_bytes(rq);
unsigned int len;
- int ret;
+ int ret = BLKPREP_OK;
char *buf;
- struct page *page;
+ struct page *page = NULL;
sector >>= ilog2(sdp->sector_size) - 9;
nr_sectors >>= ilog2(sdp->sector_size) - 9;
- page = alloc_page(GFP_ATOMIC | __GFP_ZERO);
- if (!page)
- return BLKPREP_DEFER;
+ if (sdkp->provisioning_mode != SD_ZBC_RESET_WP) {
+ page = alloc_page(GFP_ATOMIC | __GFP_ZERO);
+ if (!page)
+ return BLKPREP_DEFER;
+ }
+
+ rq->completion_data = page;
switch (sdkp->provisioning_mode) {
case SD_LBP_UNMAP:
@@ -760,12 +777,19 @@ static int sd_setup_discard_cmnd(struct scsi_cmnd *cmd)
len = sdkp->device->sector_size;
break;
+ case SD_ZBC_RESET_WP:
+ ret = sd_zbc_setup_reset_cmnd(cmd);
+ if (ret != BLKPREP_OK)
+ goto out;
+ /* Reset Write Pointer doesn't have a payload */
+ len = 0;
+ break;
+
default:
ret = BLKPREP_INVALID;
goto out;
}
- rq->completion_data = page;
rq->timeout = SD_TIMEOUT;
cmd->transfersize = len;
@@ -779,13 +803,17 @@ static int sd_setup_discard_cmnd(struct scsi_cmnd *cmd)
* discarded on disk. This allows us to report completion on the full
* amount of blocks described by the request.
*/
- blk_add_request_payload(rq, page, 0, len);
- ret = scsi_init_io(cmd);
+ if (len) {
+ blk_add_request_payload(rq, page, 0, len);
+ ret = scsi_init_io(cmd);
+ }
rq->__data_len = nr_bytes;
out:
- if (ret != BLKPREP_OK)
+ if (page && ret != BLKPREP_OK) {
+ rq->completion_data = NULL;
__free_page(page);
+ }
return ret;
}
@@ -843,6 +871,13 @@ static int sd_setup_write_same_cmnd(struct scsi_cmnd *cmd)
BUG_ON(bio_offset(bio) || bio_iovec(bio).bv_len != sdp->sector_size);
+ if (sdkp->zoned == 1 || sdp->type == TYPE_ZBC) {
+ /* sd_zbc_setup_read_write uses block layer sector units */
+ ret = sd_zbc_setup_read_write(sdkp, rq, sector, &nr_sectors);
+ if (ret != BLKPREP_OK)
+ return ret;
+ }
+
sector >>= ilog2(sdp->sector_size) - 9;
nr_sectors >>= ilog2(sdp->sector_size) - 9;
@@ -962,6 +997,13 @@ static int sd_setup_read_write_cmnd(struct scsi_cmnd *SCpnt)
SCSI_LOG_HLQUEUE(2, scmd_printk(KERN_INFO, SCpnt, "block=%llu\n",
(unsigned long long)block));
+ if (sdkp->zoned == 1 || sdp->type == TYPE_ZBC) {
+ /* sd_zbc_setup_read_write uses block layer sector units */
+ ret = sd_zbc_setup_read_write(sdkp, rq, block, &this_count);
+ if (ret != BLKPREP_OK)
+ goto out;
+ }
+
/*
* If we have a 1K hardware sectorsize, prevent access to single
* 512 byte sectors. In theory we could handle this - in fact
@@ -1148,6 +1190,16 @@ static int sd_init_command(struct scsi_cmnd *cmd)
case REQ_OP_READ:
case REQ_OP_WRITE:
return sd_setup_read_write_cmnd(cmd);
+ case REQ_OP_ZONE_REPORT:
+ return sd_zbc_setup_report_cmnd(cmd);
+ case REQ_OP_ZONE_RESET:
+ return sd_zbc_setup_reset_cmnd(cmd);
+ case REQ_OP_ZONE_OPEN:
+ return sd_zbc_setup_open_cmnd(cmd);
+ case REQ_OP_ZONE_CLOSE:
+ return sd_zbc_setup_close_cmnd(cmd);
+ case REQ_OP_ZONE_FINISH:
+ return sd_zbc_setup_finish_cmnd(cmd);
default:
BUG();
}
@@ -1157,7 +1209,8 @@ static void sd_uninit_command(struct scsi_cmnd *SCpnt)
{
struct request *rq = SCpnt->request;
- if (req_op(rq) == REQ_OP_DISCARD)
+ if (req_op(rq) == REQ_OP_DISCARD &&
+ rq->completion_data)
__free_page(rq->completion_data);
if (SCpnt->cmnd != rq->cmd) {
@@ -1778,8 +1831,16 @@ static int sd_done(struct scsi_cmnd *SCpnt)
int sense_deferred = 0;
unsigned char op = SCpnt->cmnd[0];
unsigned char unmap = SCpnt->cmnd[1] & 8;
+ unsigned char sa = SCpnt->cmnd[1] & 0xf;
- if (req_op(req) == REQ_OP_DISCARD || req_op(req) == REQ_OP_WRITE_SAME) {
+ switch (req_op(req)) {
+ case REQ_OP_DISCARD:
+ case REQ_OP_WRITE_SAME:
+ case REQ_OP_ZONE_REPORT:
+ case REQ_OP_ZONE_RESET:
+ case REQ_OP_ZONE_OPEN:
+ case REQ_OP_ZONE_CLOSE:
+ case REQ_OP_ZONE_FINISH:
if (!result) {
good_bytes = blk_rq_bytes(req);
scsi_set_resid(SCpnt, 0);
@@ -1787,6 +1848,7 @@ static int sd_done(struct scsi_cmnd *SCpnt)
good_bytes = 0;
scsi_set_resid(SCpnt, blk_rq_bytes(req));
}
+ break;
}
if (result) {
@@ -1829,6 +1891,10 @@ static int sd_done(struct scsi_cmnd *SCpnt)
case UNMAP:
sd_config_discard(sdkp, SD_LBP_DISABLE);
break;
+ case ZBC_OUT:
+ if (sa == ZO_RESET_WRITE_POINTER)
+ sd_config_discard(sdkp, SD_LBP_DISABLE);
+ break;
case WRITE_SAME_16:
case WRITE_SAME:
if (unmap)
@@ -1847,7 +1913,11 @@ static int sd_done(struct scsi_cmnd *SCpnt)
default:
break;
}
+
out:
+ if (sdkp->zoned == 1 || sdkp->device->type == TYPE_ZBC)
+ sd_zbc_done(SCpnt, &sshdr);
+
SCSI_LOG_HLCOMPLETE(1, scmd_printk(KERN_INFO, SCpnt,
"sd_done: completed %d of %d bytes\n",
good_bytes, scsi_bufflen(SCpnt)));
@@ -1982,7 +2052,6 @@ sd_spinup_disk(struct scsi_disk *sdkp)
}
}
-
/*
* Determine whether disk supports Data Integrity Field.
*/
@@ -2132,6 +2201,9 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
/* Logical blocks per physical block exponent */
sdkp->physical_block_size = (1 << (buffer[13] & 0xf)) * sector_size;
+ /* RC basis */
+ sdkp->rc_basis = (buffer[12] >> 4) & 0x3;
+
/* Lowest aligned logical block */
alignment = ((buffer[14] & 0x3f) << 8 | buffer[15]) * sector_size;
blk_queue_alignment_offset(sdp->request_queue, alignment);
@@ -2322,6 +2394,11 @@ got_data:
sector_size = 512;
}
blk_queue_logical_block_size(sdp->request_queue, sector_size);
+ blk_queue_physical_block_size(sdp->request_queue,
+ sdkp->physical_block_size);
+ sdkp->device->sector_size = sector_size;
+
+ sd_zbc_read_zones(sdkp, buffer);
{
char cap_str_2[10], cap_str_10[10];
@@ -2348,9 +2425,6 @@ got_data:
if (sdkp->capacity > 0xffffffff)
sdp->use_16_for_rw = 1;
- blk_queue_physical_block_size(sdp->request_queue,
- sdkp->physical_block_size);
- sdkp->device->sector_size = sector_size;
}
/* called with buffer of length 512 */
@@ -2612,7 +2686,7 @@ static void sd_read_app_tag_own(struct scsi_disk *sdkp, unsigned char *buffer)
struct scsi_mode_data data;
struct scsi_sense_hdr sshdr;
- if (sdp->type != TYPE_DISK)
+ if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
return;
if (sdkp->protection_type == 0)
@@ -2719,6 +2793,7 @@ static void sd_read_block_limits(struct scsi_disk *sdkp)
*/
static void sd_read_block_characteristics(struct scsi_disk *sdkp)
{
+ struct request_queue *q = sdkp->disk->queue;
unsigned char *buffer;
u16 rot;
const int vpd_len = 64;
@@ -2733,10 +2808,21 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
rot = get_unaligned_be16(&buffer[4]);
if (rot == 1) {
- queue_flag_set_unlocked(QUEUE_FLAG_NONROT, sdkp->disk->queue);
- queue_flag_clear_unlocked(QUEUE_FLAG_ADD_RANDOM, sdkp->disk->queue);
+ queue_flag_set_unlocked(QUEUE_FLAG_NONROT, q);
+ queue_flag_clear_unlocked(QUEUE_FLAG_ADD_RANDOM, q);
}
+ sdkp->zoned = (buffer[8] >> 4) & 3;
+ if (sdkp->zoned == 1)
+ q->limits.zoned = BLK_ZONED_HA;
+ else if (sdkp->device->type == TYPE_ZBC)
+ q->limits.zoned = BLK_ZONED_HM;
+ else
+ q->limits.zoned = BLK_ZONED_NONE;
+ if (blk_queue_zoned(q) && sdkp->first_scan)
+ sd_printk(KERN_NOTICE, sdkp, "Host-%s zoned block device\n",
+ q->limits.zoned == BLK_ZONED_HM ? "managed" : "aware");
+
out:
kfree(buffer);
}
@@ -2835,14 +2921,14 @@ static int sd_revalidate_disk(struct gendisk *disk)
* react badly if we do.
*/
if (sdkp->media_present) {
- sd_read_capacity(sdkp, buffer);
-
if (scsi_device_supports_vpd(sdp)) {
sd_read_block_provisioning(sdkp);
sd_read_block_limits(sdkp);
sd_read_block_characteristics(sdkp);
}
+ sd_read_capacity(sdkp, buffer);
+
sd_read_write_protect_flag(sdkp, buffer);
sd_read_cache_type(sdkp, buffer);
sd_read_app_tag_own(sdkp, buffer);
@@ -3040,9 +3126,16 @@ static int sd_probe(struct device *dev)
scsi_autopm_get_device(sdp);
error = -ENODEV;
- if (sdp->type != TYPE_DISK && sdp->type != TYPE_MOD && sdp->type != TYPE_RBC)
+ if (sdp->type != TYPE_DISK &&
+ sdp->type != TYPE_ZBC &&
+ sdp->type != TYPE_MOD &&
+ sdp->type != TYPE_RBC)
goto out;
+#ifndef CONFIG_BLK_DEV_ZONED
+ if (sdp->type == TYPE_ZBC)
+ goto out;
+#endif
SCSI_LOG_HLQUEUE(3, sdev_printk(KERN_INFO, sdp,
"sd_probe\n"));
@@ -3146,6 +3239,8 @@ static int sd_remove(struct device *dev)
del_gendisk(sdkp->disk);
sd_shutdown(dev);
+ sd_zbc_remove(sdkp);
+
blk_register_region(devt, SD_MINORS, NULL,
sd_default_probe, NULL, NULL);
diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
index 765a6f1..3452871 100644
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -56,6 +56,7 @@ enum {
SD_LBP_WS16, /* Use WRITE SAME(16) with UNMAP bit */
SD_LBP_WS10, /* Use WRITE SAME(10) with UNMAP bit */
SD_LBP_ZERO, /* Use WRITE SAME(10) with zero payload */
+ SD_ZBC_RESET_WP, /* Use RESET WRITE POINTER */
SD_LBP_DISABLE, /* Discard disabled due to failed cmd */
};
@@ -64,6 +65,11 @@ struct scsi_disk {
struct scsi_device *device;
struct device dev;
struct gendisk *disk;
+#ifdef CONFIG_BLK_DEV_ZONED
+ struct workqueue_struct *zone_work_q;
+ sector_t zone_sectors;
+ unsigned int nr_zones;
+#endif
atomic_t openers;
sector_t capacity; /* size in logical blocks */
u32 max_xfer_blocks;
@@ -94,6 +100,8 @@ struct scsi_disk {
unsigned lbpvpd : 1;
unsigned ws10 : 1;
unsigned ws16 : 1;
+ unsigned rc_basis: 2;
+ unsigned zoned: 2;
};
#define to_scsi_disk(obj) container_of(obj,struct scsi_disk,dev)
@@ -156,6 +164,13 @@ static inline unsigned int logical_to_bytes(struct scsi_device *sdev, sector_t b
return blocks * sdev->sector_size;
}
+static inline sector_t sectors_to_logical(struct scsi_device *sdev, sector_t sector)
+{
+ return sector >> (ilog2(sdev->sector_size) - 9);
+}
+
+extern void sd_config_discard(struct scsi_disk *, unsigned int);
+
/*
* A DIF-capable target device can be formatted with different
* protection schemes. Currently 0 through 3 are defined:
@@ -269,4 +284,57 @@ static inline void sd_dif_complete(struct scsi_cmnd *cmd, unsigned int a)
#endif /* CONFIG_BLK_DEV_INTEGRITY */
+#ifdef CONFIG_BLK_DEV_ZONED
+
+extern void sd_zbc_read_zones(struct scsi_disk *, char *);
+extern void sd_zbc_remove(struct scsi_disk *);
+extern int sd_zbc_setup_read_write(struct scsi_disk *, struct request *,
+ sector_t, unsigned int *);
+extern int sd_zbc_setup_report_cmnd(struct scsi_cmnd *);
+extern int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *);
+extern int sd_zbc_setup_open_cmnd(struct scsi_cmnd *);
+extern int sd_zbc_setup_close_cmnd(struct scsi_cmnd *);
+extern int sd_zbc_setup_finish_cmnd(struct scsi_cmnd *);
+extern void sd_zbc_done(struct scsi_cmnd *, struct scsi_sense_hdr *);
+
+#else /* CONFIG_BLK_DEV_ZONED */
+
+static inline void sd_zbc_read_zones(struct scsi_disk *sdkp,
+ unsigned char *buf) {}
+static inline void sd_zbc_remove(struct scsi_disk *sdkp) {}
+
+static inline int sd_zbc_setup_read_write(struct scsi_disk *sdkp,
+ struct request *rq, sector_t sector,
+ unsigned int *num_sectors)
+{
+ /* Let the drive fail requests */
+ return BLKPREP_OK;
+}
+
+static inline int sd_zbc_setup_report_cmnd(struct scsi_cmnd *cmd)
+{
+ return BLKPREP_KILL;
+}
+static inline int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *cmd)
+{
+ return BLKPREP_KILL;
+}
+static inline int sd_zbc_setup_open_cmnd(struct scsi_cmnd *cmd)
+{
+ return BLKPREP_KILL;
+}
+static inline int sd_zbc_setup_close_cmnd(struct scsi_cmnd *cmd)
+{
+ return BLKPREP_KILL;
+}
+static inline int sd_zbc_setup_finish_cmnd(struct scsi_cmnd *cmd)
+{
+ return BLKPREP_KILL;
+}
+
+static inline void sd_zbc_done(struct scsi_cmnd *cmd,
+ struct scsi_sense_hdr *sshdr) {}
+
+#endif /* CONFIG_BLK_DEV_ZONED */
+
#endif /* _SCSI_DISK_H */
diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
new file mode 100644
index 0000000..ec9c3fc
--- /dev/null
+++ b/drivers/scsi/sd_zbc.c
@@ -0,0 +1,1097 @@
+/*
+ * SCSI Zoned Block commands
+ *
+ * Copyright (C) 2014-2015 SUSE Linux GmbH
+ * Written by: Hannes Reinecke <hare@suse.de>
+ * Modified by: Damien Le Moal <damien.lemoal@hgst.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License version
+ * 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; see the file COPYING. If not, write to
+ * the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139,
+ * USA.
+ *
+ */
+
+#include <linux/blkdev.h>
+#include <linux/rbtree.h>
+
+#include <asm/unaligned.h>
+
+#include <scsi/scsi.h>
+#include <scsi/scsi_cmnd.h>
+#include <scsi/scsi_dbg.h>
+#include <scsi/scsi_device.h>
+#include <scsi/scsi_driver.h>
+#include <scsi/scsi_host.h>
+#include <scsi/scsi_eh.h>
+
+#include "sd.h"
+#include "scsi_priv.h"
+
+enum zbc_zone_type {
+ ZBC_ZONE_TYPE_CONV = 0x1,
+ ZBC_ZONE_TYPE_SEQWRITE_REQ,
+ ZBC_ZONE_TYPE_SEQWRITE_PREF,
+ ZBC_ZONE_TYPE_RESERVED,
+};
+
+enum zbc_zone_cond {
+ ZBC_ZONE_COND_NO_WP,
+ ZBC_ZONE_COND_EMPTY,
+ ZBC_ZONE_COND_IMP_OPEN,
+ ZBC_ZONE_COND_EXP_OPEN,
+ ZBC_ZONE_COND_CLOSED,
+ ZBC_ZONE_COND_READONLY = 0xd,
+ ZBC_ZONE_COND_FULL,
+ ZBC_ZONE_COND_OFFLINE,
+};
+
+#define SD_ZBC_BUF_SIZE 131072
+
+#define sd_zbc_debug(sdkp, fmt, args...) \
+ pr_debug("%s %s [%s]: " fmt, \
+ dev_driver_string(&(sdkp)->device->sdev_gendev), \
+ dev_name(&(sdkp)->device->sdev_gendev), \
+ (sdkp)->disk->disk_name, ## args)
+
+#define sd_zbc_debug_ratelimit(sdkp, fmt, args...) \
+ do { \
+ if (printk_ratelimit()) \
+ sd_zbc_debug(sdkp, fmt, ## args); \
+ } while( 0 )
+
+#define sd_zbc_err(sdkp, fmt, args...) \
+ pr_err("%s %s [%s]: " fmt, \
+ dev_driver_string(&(sdkp)->device->sdev_gendev), \
+ dev_name(&(sdkp)->device->sdev_gendev), \
+ (sdkp)->disk->disk_name, ## args)
+
+struct zbc_zone_work {
+ struct work_struct zone_work;
+ struct scsi_disk *sdkp;
+ sector_t sector;
+ sector_t nr_sects;
+ bool init;
+ unsigned int nr_zones;
+};
+
+struct blk_zone *zbc_desc_to_zone(struct scsi_disk *sdkp, unsigned char *rec)
+{
+ struct blk_zone *zone;
+
+ zone = kzalloc(sizeof(struct blk_zone), GFP_KERNEL);
+ if (!zone)
+ return NULL;
+
+ /* Zone type */
+ switch(rec[0] & 0x0f) {
+ case ZBC_ZONE_TYPE_CONV:
+ case ZBC_ZONE_TYPE_SEQWRITE_REQ:
+ case ZBC_ZONE_TYPE_SEQWRITE_PREF:
+ zone->type = rec[0] & 0x0f;
+ break;
+ default:
+ zone->type = BLK_ZONE_TYPE_UNKNOWN;
+ break;
+ }
+
+ /* Zone condition */
+ zone->cond = (rec[1] >> 4) & 0xf;
+ if (rec[1] & 0x01)
+ zone->reset = 1;
+ if (rec[1] & 0x02)
+ zone->non_seq = 1;
+
+ /* Zone start sector and length */
+ zone->len = logical_to_sectors(sdkp->device,
+ get_unaligned_be64(&rec[8]));
+ zone->start = logical_to_sectors(sdkp->device,
+ get_unaligned_be64(&rec[16]));
+
+ /* Zone write pointer */
+ if (blk_zone_is_empty(zone) &&
+ zone->wp != zone->start)
+ zone->wp = zone->start;
+ else if (blk_zone_is_full(zone))
+ zone->wp = zone->start + zone->len;
+ else if (blk_zone_is_seq(zone))
+ zone->wp = logical_to_sectors(sdkp->device,
+ get_unaligned_be64(&rec[24]));
+ else
+ zone->wp = (sector_t)-1;
+
+ return zone;
+}
+
+static int zbc_parse_zones(struct scsi_disk *sdkp, unsigned char *buf,
+ unsigned int buf_len, sector_t *next_sector)
+{
+ struct request_queue *q = sdkp->disk->queue;
+ sector_t capacity = logical_to_sectors(sdkp->device, sdkp->capacity);
+ unsigned char *rec = buf;
+ unsigned int zone_len, list_length;
+
+ /* Parse REPORT ZONES header */
+ list_length = get_unaligned_be32(&buf[0]);
+ rec = buf + 64;
+ list_length += 64;
+
+ if (list_length < buf_len)
+ buf_len = list_length;
+
+ /* Parse REPORT ZONES zone descriptors */
+ *next_sector = capacity;
+ while (rec < buf + buf_len) {
+
+ struct blk_zone *new, *old;
+
+ new = zbc_desc_to_zone(sdkp, rec);
+ if (!new)
+ return -ENOMEM;
+
+ zone_len = new->len;
+ *next_sector = new->start + zone_len;
+
+ old = blk_insert_zone(q, new);
+ if (old) {
+ blk_lock_zone(old);
+
+ /*
+ * Always update the zone state flags and the zone
+ * offline and read-only condition as the drive may
+ * change those independently of the commands being
+ * executed
+ */
+ old->reset = new->reset;
+ old->non_seq = new->non_seq;
+ if (blk_zone_is_offline(new) ||
+ blk_zone_is_readonly(new))
+ old->cond = new->cond;
+
+ if (blk_zone_in_update(old)) {
+ old->cond = new->cond;
+ old->wp = new->wp;
+ blk_clear_zone_update(old);
+ }
+
+ blk_unlock_zone(old);
+
+ kfree(new);
+ }
+
+ rec += 64;
+
+ }
+
+ return 0;
+}
+
+/**
+ * sd_zbc_report_zones - Issue a REPORT ZONES scsi command
+ * @sdkp: SCSI disk to which the command should be sent
+ * @buffer: response buffer
+ * @bufflen: length of @buffer
+ * @start_sector: logical sector from which the zone information should be reported
+ * @option: reporting option to be used
+ * @partial: flag to set the 'partial' bit for report zones command
+ */
+int sd_zbc_report_zones(struct scsi_disk *sdkp, unsigned char *buffer,
+ int bufflen, sector_t start_sector,
+ enum zbc_zone_reporting_options option, bool partial)
+{
+ struct scsi_device *sdp = sdkp->device;
+ const int timeout = sdp->request_queue->rq_timeout;
+ struct scsi_sense_hdr sshdr;
+ sector_t start_lba = sectors_to_logical(sdkp->device, start_sector);
+ unsigned char cmd[16];
+ int result;
+
+ if (!scsi_device_online(sdp))
+ return -ENODEV;
+
+ sd_zbc_debug(sdkp, "REPORT ZONES lba %zu len %d\n",
+ start_lba, bufflen);
+
+ memset(cmd, 0, 16);
+ cmd[0] = ZBC_IN;
+ cmd[1] = ZI_REPORT_ZONES;
+ put_unaligned_be64(start_lba, &cmd[2]);
+ put_unaligned_be32(bufflen, &cmd[10]);
+ cmd[14] = (partial ? ZBC_REPORT_ZONE_PARTIAL : 0) | option;
+ memset(buffer, 0, bufflen);
+
+ result = scsi_execute_req(sdp, cmd, DMA_FROM_DEVICE,
+ buffer, bufflen, &sshdr,
+ timeout, SD_MAX_RETRIES, NULL);
+
+ if (result) {
+ sd_zbc_err(sdkp,
+ "REPORT ZONES lba %zu failed with %d/%d\n",
+ start_lba, host_byte(result), driver_byte(result));
+ return -EIO;
+ }
+
+ return 0;
+}
+
+/**
+ * Set or clear the update flag of all zones contained
+ * in the range sector..sector+nr_sects.
+ * Return the number of zones marked/cleared.
+ */
+static int __sd_zbc_zones_updating(struct scsi_disk *sdkp,
+ sector_t sector, sector_t nr_sects,
+ bool set)
+{
+ struct request_queue *q = sdkp->disk->queue;
+ struct blk_zone *zone;
+ struct rb_node *node;
+ unsigned long flags;
+ int nr_zones = 0;
+
+ if (!nr_sects) {
+ /* All zones */
+ sector = 0;
+ nr_sects = logical_to_sectors(sdkp->device, sdkp->capacity);
+ }
+
+ spin_lock_irqsave(&q->zones_lock, flags);
+ for (node = rb_first(&q->zones); node && nr_sects; node = rb_next(node)) {
+ zone = rb_entry(node, struct blk_zone, node);
+ if (sector < zone->start || sector >= (zone->start + zone->len))
+ continue;
+ if (set) {
+ if (!test_and_set_bit_lock(BLK_ZONE_IN_UPDATE, &zone->flags))
+ nr_zones++;
+ } else if (test_and_clear_bit(BLK_ZONE_IN_UPDATE, &zone->flags)) {
+ wake_up_bit(&zone->flags, BLK_ZONE_IN_UPDATE);
+ nr_zones++;
+ }
+ sector = zone->start + zone->len;
+ if (nr_sects <= zone->len)
+ nr_sects = 0;
+ else
+ nr_sects -= zone->len;
+ }
+ spin_unlock_irqrestore(&q->zones_lock, flags);
+
+ return nr_zones;
+}
+
+static inline int sd_zbc_set_zones_updating(struct scsi_disk *sdkp,
+ sector_t sector, sector_t nr_sects)
+{
+ return __sd_zbc_zones_updating(sdkp, sector, nr_sects, true);
+}
+
+static inline int sd_zbc_clear_zones_updating(struct scsi_disk *sdkp,
+ sector_t sector, sector_t nr_sects)
+{
+ return __sd_zbc_zones_updating(sdkp, sector, nr_sects, false);
+}
+
+static void sd_zbc_start_queue(struct request_queue *q)
+{
+ unsigned long flags;
+
+ if (q->mq_ops) {
+ blk_mq_start_hw_queues(q);
+ } else {
+ spin_lock_irqsave(q->queue_lock, flags);
+ blk_start_queue(q);
+ spin_unlock_irqrestore(q->queue_lock, flags);
+ }
+}
+
+static void sd_zbc_update_zone_work(struct work_struct *work)
+{
+ struct zbc_zone_work *zwork =
+ container_of(work, struct zbc_zone_work, zone_work);
+ struct scsi_disk *sdkp = zwork->sdkp;
+ sector_t capacity = logical_to_sectors(sdkp->device, sdkp->capacity);
+ struct request_queue *q = sdkp->disk->queue;
+ sector_t end_sector, sector = zwork->sector;
+ unsigned int bufsize;
+ unsigned char *buf;
+ int ret = -ENOMEM;
+
+ /* Get a buffer */
+ if (!zwork->nr_zones) {
+ bufsize = SD_ZBC_BUF_SIZE;
+ } else {
+ bufsize = (zwork->nr_zones + 1) * 64;
+ if (bufsize < 512)
+ bufsize = 512;
+ else if (bufsize > SD_ZBC_BUF_SIZE)
+ bufsize = SD_ZBC_BUF_SIZE;
+ else
+ bufsize = (bufsize + 511) & ~511;
+ }
+ buf = kmalloc(bufsize, GFP_KERNEL | GFP_DMA);
+ if (!buf) {
+ sd_zbc_err(sdkp, "Failed to allocate zone report buffer\n");
+ goto done_free;
+ }
+
+ /* Process sector range */
+ end_sector = zwork->sector + zwork->nr_sects;
+ while(sector < min(end_sector, capacity)) {
+
+ /* Get zone report */
+ ret = sd_zbc_report_zones(sdkp, buf, bufsize, sector,
+ ZBC_ZONE_REPORTING_OPTION_ALL, true);
+ if (ret)
+ break;
+
+ ret = zbc_parse_zones(sdkp, buf, bufsize, &sector);
+ if (ret)
+ break;
+
+ /* Kick start the queue to allow requests waiting */
+ /* for the zones just updated to run */
+ sd_zbc_start_queue(q);
+
+ }
+
+done_free:
+ if (ret)
+ sd_zbc_clear_zones_updating(sdkp, zwork->sector, zwork->nr_sects);
+ if (buf)
+ kfree(buf);
+ kfree(zwork);
+}
+
+/**
+ * sd_zbc_update_zones - Update zone information for zones in the
+ * range @sector..@sector+@nr_sects. If not in init mode, the update
+ * is done only for zones marked with the update flag.
+ * @sdkp: SCSI disk for which the zone information needs to be updated
+ * @sector: First sector of the first zone to be updated
+ * @nr_sects: Number of sectors to update (0 means all zones)
+ * @gfpflags: Allocation flags for the zone work item
+ * @init: True for the initial scan (zones are not marked as updating)
+ */
+static int sd_zbc_update_zones(struct scsi_disk *sdkp,
+ sector_t sector, sector_t nr_sects,
+ gfp_t gfpflags, bool init)
+{
+ struct zbc_zone_work *zwork;
+
+ zwork = kzalloc(sizeof(struct zbc_zone_work), gfpflags);
+ if (!zwork) {
+ sd_zbc_err(sdkp, "Failed to allocate zone work\n");
+ return -ENOMEM;
+ }
+
+ if (!nr_sects) {
+ /* All zones */
+ sector = 0;
+ nr_sects = logical_to_sectors(sdkp->device, sdkp->capacity);
+ }
+
+ INIT_WORK(&zwork->zone_work, sd_zbc_update_zone_work);
+ zwork->sdkp = sdkp;
+ zwork->sector = sector;
+ zwork->nr_sects = nr_sects;
+ zwork->init = init;
+
+ if (!init)
+ /* Mark the zones falling in the report as updating */
+ zwork->nr_zones = sd_zbc_set_zones_updating(sdkp, sector, nr_sects);
+
+ if (init || zwork->nr_zones)
+ queue_work(sdkp->zone_work_q, &zwork->zone_work);
+ else
+ kfree(zwork);
+
+ return 0;
+}
+
+int sd_zbc_setup_report_cmnd(struct scsi_cmnd *cmd)
+{
+ struct request *rq = cmd->request;
+ struct gendisk *disk = rq->rq_disk;
+ struct scsi_disk *sdkp = scsi_disk(disk);
+ int ret;
+
+ if (!sdkp->zone_work_q)
+ return BLKPREP_KILL;
+
+ ret = sd_zbc_update_zones(sdkp, blk_rq_pos(rq), blk_rq_sectors(rq),
+ GFP_ATOMIC, false);
+ if (unlikely(ret))
+ return BLKPREP_DEFER;
+
+ return BLKPREP_DONE;
+}
+
+static void sd_zbc_setup_action_cmnd(struct scsi_cmnd *cmd,
+ u8 action,
+ bool all)
+{
+ struct request *rq = cmd->request;
+ struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
+ sector_t lba;
+
+ cmd->cmd_len = 16;
+ cmd->cmnd[0] = ZBC_OUT;
+ cmd->cmnd[1] = action;
+ if (all) {
+ cmd->cmnd[14] |= 0x01;
+ } else {
+ lba = sectors_to_logical(sdkp->device, blk_rq_pos(rq));
+ put_unaligned_be64(lba, &cmd->cmnd[2]);
+ }
+
+ rq->completion_data = NULL;
+ rq->timeout = SD_TIMEOUT;
+ rq->__data_len = blk_rq_bytes(rq);
+
+ /* Don't retry */
+ cmd->allowed = 0;
+ cmd->transfersize = 0;
+ cmd->sc_data_direction = DMA_NONE;
+}
+
+int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *cmd)
+{
+ struct request *rq = cmd->request;
+ struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
+ sector_t sector = blk_rq_pos(rq);
+ sector_t nr_sects = blk_rq_sectors(rq);
+ struct blk_zone *zone = NULL;
+ int ret = BLKPREP_OK;
+
+ if (nr_sects) {
+ zone = blk_lookup_zone(rq->q, sector);
+ if (!zone)
+ return BLKPREP_KILL;
+ }
+
+ if (zone) {
+
+ blk_lock_zone(zone);
+
+ /* If the zone is being updated, wait */
+ if (blk_zone_in_update(zone)) {
+ ret = BLKPREP_DEFER;
+ goto out;
+ }
+
+ if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
+ sd_zbc_debug(sdkp,
+ "Discarding unknown zone %zu\n",
+ zone->start);
+ ret = BLKPREP_KILL;
+ goto out;
+ }
+
+ /* Nothing to do for conventional zones */
+ if (blk_zone_is_conv(zone)) {
+ ret = BLKPREP_DONE;
+ goto out;
+ }
+
+ if (!blk_try_write_lock_zone(zone)) {
+ ret = BLKPREP_DEFER;
+ goto out;
+ }
+
+ /* Nothing to do if the zone is already empty */
+ if (blk_zone_is_empty(zone)) {
+ blk_write_unlock_zone(zone);
+ ret = BLKPREP_DONE;
+ goto out;
+ }
+
+ if (sector != zone->start ||
+ (nr_sects != zone->len)) {
+ sd_printk(KERN_ERR, sdkp,
+ "Unaligned reset wp request, start %zu/%zu"
+ " len %zu/%zu\n",
+ zone->start, sector, zone->len, nr_sects);
+ blk_write_unlock_zone(zone);
+ ret = BLKPREP_KILL;
+ goto out;
+ }
+
+ }
+
+ sd_zbc_setup_action_cmnd(cmd, ZO_RESET_WRITE_POINTER, !zone);
+
+out:
+ if (zone) {
+ if (ret == BLKPREP_OK) {
+ /*
+ * Opportunistic update. Will be fixed up
+ * with zone update if the command fails.
+ */
+ zone->wp = zone->start;
+ zone->cond = BLK_ZONE_COND_EMPTY;
+ zone->reset = 0;
+ zone->non_seq = 0;
+ }
+ blk_unlock_zone(zone);
+ }
+
+ return ret;
+}
+
+int sd_zbc_setup_open_cmnd(struct scsi_cmnd *cmd)
+{
+ struct request *rq = cmd->request;
+ struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
+ sector_t sector = blk_rq_pos(rq);
+ sector_t nr_sects = blk_rq_sectors(rq);
+ struct blk_zone *zone = NULL;
+ int ret = BLKPREP_OK;
+
+ if (nr_sects) {
+ zone = blk_lookup_zone(rq->q, sector);
+ if (!zone)
+ return BLKPREP_KILL;
+ }
+
+ if (zone) {
+
+ blk_lock_zone(zone);
+
+ /* If the zone is being updated, wait */
+ if (blk_zone_in_update(zone)) {
+ ret = BLKPREP_DEFER;
+ goto out;
+ }
+
+ if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
+ sd_zbc_debug(sdkp,
+ "Opening unknown zone %zu\n",
+ zone->start);
+ ret = BLKPREP_KILL;
+ goto out;
+ }
+
+ /*
+ * Nothing to do for conventional zones,
+ * zones already open or full zones.
+ */
+ if (blk_zone_is_conv(zone) ||
+ blk_zone_is_open(zone) ||
+ blk_zone_is_full(zone)) {
+ ret = BLKPREP_DONE;
+ goto out;
+ }
+
+ if (sector != zone->start ||
+ (nr_sects != zone->len)) {
+ sd_printk(KERN_ERR, sdkp,
+ "Unaligned open zone request, start %zu/%zu"
+ " len %zu/%zu\n",
+ zone->start, sector, zone->len, nr_sects);
+ ret = BLKPREP_KILL;
+ goto out;
+ }
+
+ }
+
+ sd_zbc_setup_action_cmnd(cmd, ZO_OPEN_ZONE, !zone);
+
+out:
+ if (zone) {
+ if (ret == BLKPREP_OK)
+ /*
+ * Opportunistic update. Will be fixed up
+ * with zone update if the command fails.
+ */
+ zone->cond = BLK_ZONE_COND_EXP_OPEN;
+ blk_unlock_zone(zone);
+ }
+
+ return ret;
+}
+
+int sd_zbc_setup_close_cmnd(struct scsi_cmnd *cmd)
+{
+ struct request *rq = cmd->request;
+ struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
+ sector_t sector = blk_rq_pos(rq);
+ sector_t nr_sects = blk_rq_sectors(rq);
+ struct blk_zone *zone = NULL;
+ int ret = BLKPREP_OK;
+
+ if (nr_sects) {
+ zone = blk_lookup_zone(rq->q, sector);
+ if (!zone)
+ return BLKPREP_KILL;
+ }
+
+ if (zone) {
+
+ blk_lock_zone(zone);
+
+ /* If the zone is being updated, wait */
+ if (blk_zone_in_update(zone)) {
+ ret = BLKPREP_DEFER;
+ goto out;
+ }
+
+ if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
+ sd_zbc_debug(sdkp,
+ "Closing unknown zone %zu\n",
+ zone->start);
+ ret = BLKPREP_KILL;
+ goto out;
+ }
+
+ /*
+ * Nothing to do for conventional zones,
+ * full zones or empty zones.
+ */
+ if (blk_zone_is_conv(zone) ||
+ blk_zone_is_full(zone) ||
+ blk_zone_is_empty(zone)) {
+ ret = BLKPREP_DONE;
+ goto out;
+ }
+
+ if (sector != zone->start ||
+ (nr_sects != zone->len)) {
+ sd_printk(KERN_ERR, sdkp,
+ "Unaligned close zone request, start %zu/%zu"
+ " len %zu/%zu\n",
+ zone->start, sector, zone->len, nr_sects);
+ ret = BLKPREP_KILL;
+ goto out;
+ }
+
+ }
+
+ sd_zbc_setup_action_cmnd(cmd, ZO_CLOSE_ZONE, !zone);
+
+out:
+ if (zone) {
+ if (ret == BLKPREP_OK)
+ /*
+ * Opportunistic update. Will be fixed up
+ * with zone update if the command fails.
+ */
+ zone->cond = BLK_ZONE_COND_CLOSED;
+ blk_unlock_zone(zone);
+ }
+
+ return ret;
+}
+
+int sd_zbc_setup_finish_cmnd(struct scsi_cmnd *cmd)
+{
+ struct request *rq = cmd->request;
+ struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
+ sector_t sector = blk_rq_pos(rq);
+ sector_t nr_sects = blk_rq_sectors(rq);
+ struct blk_zone *zone = NULL;
+ int ret = BLKPREP_OK;
+
+ if (nr_sects) {
+ zone = blk_lookup_zone(rq->q, sector);
+ if (!zone)
+ return BLKPREP_KILL;
+ }
+
+ if (zone) {
+
+ blk_lock_zone(zone);
+
+ /* If the zone is being updated, wait */
+ if (blk_zone_in_update(zone)) {
+ ret = BLKPREP_DEFER;
+ goto out;
+ }
+
+ if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
+ sd_zbc_debug(sdkp,
+ "Finishing unknown zone %zu\n",
+ zone->start);
+ ret = BLKPREP_KILL;
+ goto out;
+ }
+
+ /* Nothing to do for conventional zones and full zones */
+ if (blk_zone_is_conv(zone) ||
+ blk_zone_is_full(zone)) {
+ ret = BLKPREP_DONE;
+ goto out;
+ }
+
+ if (sector != zone->start ||
+ (nr_sects != zone->len)) {
+ sd_printk(KERN_ERR, sdkp,
+ "Unaligned finish zone request, start %zu/%zu"
+ " len %zu/%zu\n",
+ zone->start, sector, zone->len, nr_sects);
+ ret = BLKPREP_KILL;
+ goto out;
+ }
+
+ }
+
+ sd_zbc_setup_action_cmnd(cmd, ZO_FINISH_ZONE, !zone);
+
+out:
+ if (zone) {
+ if (ret == BLKPREP_OK) {
+ /*
+ * Opportunistic update. Will be fixed up
+ * with zone update if the command fails.
+ */
+ zone->cond = BLK_ZONE_COND_FULL;
+ if (blk_zone_is_seq(zone))
+ zone->wp = zone->start + zone->len;
+ }
+ blk_unlock_zone(zone);
+ }
+
+ return ret;
+}
+
+int sd_zbc_setup_read_write(struct scsi_disk *sdkp, struct request *rq,
+ sector_t sector, unsigned int *num_sectors)
+{
+ struct blk_zone *zone;
+ unsigned int sectors = *num_sectors;
+ int ret = BLKPREP_OK;
+
+ zone = blk_lookup_zone(rq->q, sector);
+ if (!zone)
+ /* Let the drive handle the request */
+ return BLKPREP_OK;
+
+ blk_lock_zone(zone);
+
+ /* If the zone is being updated, wait */
+ if (blk_zone_in_update(zone)) {
+ ret = BLKPREP_DEFER;
+ goto out;
+ }
+
+ if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
+ sd_zbc_debug(sdkp,
+ "Unknown zone %zu\n",
+ zone->start);
+ ret = BLKPREP_KILL;
+ goto out;
+ }
+
+ /* For offline and read-only zones, let the drive fail the command */
+ if (blk_zone_is_offline(zone) ||
+ blk_zone_is_readonly(zone))
+ goto out;
+
+ /* Do not allow zone boundaries crossing */
+ if (sector + sectors > zone->start + zone->len) {
+ ret = BLKPREP_KILL;
+ goto out;
+ }
+
+ /* For conventional zones, no checks */
+ if (blk_zone_is_conv(zone))
+ goto out;
+
+ if (req_op(rq) == REQ_OP_WRITE ||
+ req_op(rq) == REQ_OP_WRITE_SAME) {
+
+ /*
+ * Write requests may change the write pointer and
+ * transition the zone condition to full. Changes
+ * are opportunistic here. If the request fails, a
+ * zone update will fix the zone information.
+ */
+ if (blk_zone_is_seq_req(zone)) {
+
+ /*
+ * Do not issue more than one write at a time per
+ * zone. This solves write ordering problems due to
+ * the unlocking of the request queue in the dispatch
+ * path in the non scsi-mq case. For scsi-mq, this
+ * also avoids potential write reordering when multiple
+ * threads running on different CPUs write to the same
+ * zone (with a synchronized sequential pattern).
+ */
+ if (!blk_try_write_lock_zone(zone)) {
+ ret = BLKPREP_DEFER;
+ goto out;
+ }
+
+ /* For host-managed drives, writes are allowed */
+ /* only at the write pointer position. */
+ if (zone->wp != sector) {
+ blk_write_unlock_zone(zone);
+ ret = BLKPREP_KILL;
+ goto out;
+ }
+
+ zone->wp += sectors;
+ if (zone->wp >= zone->start + zone->len) {
+ zone->cond = BLK_ZONE_COND_FULL;
+ zone->wp = zone->start + zone->len;
+ }
+
+ } else {
+
+ /* For host-aware drives, writes are allowed */
+ /* anywhere in the zone, but wp can only go */
+ /* forward. */
+ sector_t end_sector = sector + sectors;
+ if (sector == zone->wp &&
+ end_sector >= zone->start + zone->len) {
+ zone->cond = BLK_ZONE_COND_FULL;
+ zone->wp = zone->start + zone->len;
+ } else if (end_sector > zone->wp) {
+ zone->wp = end_sector;
+ }
+
+ }
+
+ } else {
+
+ /* Check read after write pointer */
+ if (sector + sectors <= zone->wp)
+ goto out;
+
+ if (zone->wp <= sector) {
+ /* Read beyond WP: clear request buffer */
+ struct req_iterator iter;
+ struct bio_vec bvec;
+ unsigned long flags;
+ void *buf;
+ rq_for_each_segment(bvec, rq, iter) {
+ buf = bvec_kmap_irq(&bvec, &flags);
+ memset(buf, 0, bvec.bv_len);
+ flush_dcache_page(bvec.bv_page);
+ bvec_kunmap_irq(buf, &flags);
+ }
+ ret = BLKPREP_DONE;
+ goto out;
+ }
+
+ /* Read straddles the WP position: limit request size */
+ *num_sectors = zone->wp - sector;
+
+ }
+
+out:
+ blk_unlock_zone(zone);
+
+ return ret;
+}
+
+void sd_zbc_done(struct scsi_cmnd *cmd,
+ struct scsi_sense_hdr *sshdr)
+{
+ int result = cmd->result;
+ struct request *rq = cmd->request;
+ struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
+ struct request_queue *q = sdkp->disk->queue;
+ sector_t pos = blk_rq_pos(rq);
+ struct blk_zone *zone = NULL;
+ bool write_unlock = false;
+
+ /*
+ * Get the target zone of commands of interest. Some may
+ * apply to all zones so check the request sectors first.
+ */
+ switch (req_op(rq)) {
+ case REQ_OP_DISCARD:
+ case REQ_OP_WRITE:
+ case REQ_OP_WRITE_SAME:
+ case REQ_OP_ZONE_RESET:
+ write_unlock = true;
+ /* fallthru */
+ case REQ_OP_ZONE_OPEN:
+ case REQ_OP_ZONE_CLOSE:
+ case REQ_OP_ZONE_FINISH:
+ if (blk_rq_sectors(rq))
+ zone = blk_lookup_zone(q, pos);
+ break;
+ }
+
+ if (zone && write_unlock)
+ blk_write_unlock_zone(zone);
+
+ if (!result)
+ return;
+
+ if (sshdr->sense_key == ILLEGAL_REQUEST &&
+ sshdr->asc == 0x21)
+ /*
+ * It is unlikely that retrying requests failed with any
+ * kind of alignment error will result in success. So don't
+ * try. Report the error back to the user quickly so that
+ * corrective actions can be taken after obtaining updated
+ * zone information.
+ */
+ cmd->allowed = 0;
+
+ /* On error, force an update unless this is a failed report */
+ if (req_op(rq) == REQ_OP_ZONE_REPORT)
+ sd_zbc_clear_zones_updating(sdkp, pos, blk_rq_sectors(rq));
+ else if (zone)
+ sd_zbc_update_zones(sdkp, zone->start, zone->len,
+ GFP_ATOMIC, false);
+}
+
+void sd_zbc_read_zones(struct scsi_disk *sdkp, char *buf)
+{
+ struct request_queue *q = sdkp->disk->queue;
+ struct blk_zone *zone;
+ sector_t capacity;
+ sector_t sector;
+ bool init = false;
+ u32 rep_len;
+ int ret = 0;
+
+ if (sdkp->zoned != 1 && sdkp->device->type != TYPE_ZBC)
+ /*
+ * Device managed or normal SCSI disk,
+ * no special handling required
+ */
+ return;
+
+ /* Do a report zone to get the maximum LBA to check capacity */
+ ret = sd_zbc_report_zones(sdkp, buf, SD_BUF_SIZE,
+ 0, ZBC_ZONE_REPORTING_OPTION_ALL, false);
+ if (ret < 0)
+ return;
+
+ rep_len = get_unaligned_be32(&buf[0]);
+ if (rep_len < 64) {
+ sd_printk(KERN_WARNING, sdkp,
+ "REPORT ZONES report invalid length %u\n",
+ rep_len);
+ return;
+ }
+
+ if (sdkp->rc_basis == 0) {
+ /* The max_lba field is the capacity of this device */
+ sector_t lba = get_unaligned_be64(&buf[8]);
+ if (lba + 1 > sdkp->capacity) {
+ if (sdkp->first_scan)
+ sd_printk(KERN_WARNING, sdkp,
+ "Changing capacity from %zu "
+ "to max LBA+1 %zu\n",
+ sdkp->capacity,
+ (sector_t) lba + 1);
+ sdkp->capacity = lba + 1;
+ }
+ }
+
+ /* Setup the zone work queue */
+ if (! sdkp->zone_work_q) {
+ sdkp->zone_work_q =
+ alloc_ordered_workqueue("zbc_wq_%s", WQ_MEM_RECLAIM,
+ sdkp->disk->disk_name);
+ if (!sdkp->zone_work_q) {
+ sdev_printk(KERN_WARNING, sdkp->device,
+ "Create zoned disk workqueue failed\n");
+ return;
+ }
+ init = true;
+ }
+
+ /*
+ * Parse what we already got. If all zones are not parsed yet,
+ * kick start an update to get the remaining.
+ */
+ capacity = logical_to_sectors(sdkp->device, sdkp->capacity);
+ ret = zbc_parse_zones(sdkp, buf, SD_BUF_SIZE, &sector);
+ if (ret == 0 && sector < capacity) {
+ sd_zbc_update_zones(sdkp, sector, capacity - sector,
+ GFP_KERNEL, init);
+ drain_workqueue(sdkp->zone_work_q);
+ }
+ if (ret)
+ return;
+
+ /*
+ * Analyze the zones layout: if all zones are the same size and
+ * the size is a power of 2, chunk the device and map discard to
+ * reset write pointer command. Otherwise, disable discard.
+ */
+ sdkp->zone_sectors = 0;
+ sdkp->nr_zones = 0;
+ sector = 0;
+ while(sector < capacity) {
+
+ zone = blk_lookup_zone(q, sector);
+ if (!zone) {
+ sdkp->zone_sectors = 0;
+ sdkp->nr_zones = 0;
+ break;
+ }
+
+ sector += zone->len;
+
+ if (sdkp->zone_sectors == 0) {
+ sdkp->zone_sectors = zone->len;
+ } else if (sector != capacity &&
+ zone->len != sdkp->zone_sectors) {
+ sdkp->zone_sectors = 0;
+ sdkp->nr_zones = 0;
+ break;
+ }
+
+ sdkp->nr_zones++;
+
+ }
+
+ if (!sdkp->zone_sectors ||
+ !is_power_of_2(sdkp->zone_sectors)) {
+ sd_config_discard(sdkp, SD_LBP_DISABLE);
+ if (sdkp->first_scan)
+ sd_printk(KERN_NOTICE, sdkp,
+ "%u zones (non constant zone size)\n",
+ sdkp->nr_zones);
+ return;
+ }
+
+ /* Setup discard granularity to the zone size */
+ blk_queue_chunk_sectors(sdkp->disk->queue, sdkp->zone_sectors);
+ sdkp->max_unmap_blocks = sdkp->zone_sectors;
+ sdkp->unmap_alignment = sectors_to_logical(sdkp->device,
+ sdkp->zone_sectors);
+ sdkp->unmap_granularity = sdkp->unmap_alignment;
+ sd_config_discard(sdkp, SD_ZBC_RESET_WP);
+
+ if (sdkp->first_scan) {
+ if (sdkp->nr_zones * sdkp->zone_sectors == capacity)
+ sd_printk(KERN_NOTICE, sdkp,
+ "%u zones of %zu sectors\n",
+ sdkp->nr_zones,
+ sdkp->zone_sectors);
+ else
+ sd_printk(KERN_NOTICE, sdkp,
+ "%u zones of %zu sectors "
+ "+ 1 runt zone\n",
+ sdkp->nr_zones - 1,
+ sdkp->zone_sectors);
+ }
+}
+
+void sd_zbc_remove(struct scsi_disk *sdkp)
+{
+
+ sd_config_discard(sdkp, SD_LBP_DISABLE);
+
+ if (sdkp->zone_work_q) {
+ drain_workqueue(sdkp->zone_work_q);
+ destroy_workqueue(sdkp->zone_work_q);
+ sdkp->zone_work_q = NULL;
+ blk_drop_zones(sdkp->disk->queue);
+ }
+}
+
diff --git a/include/scsi/scsi_proto.h b/include/scsi/scsi_proto.h
index d1defd1..6ba66e0 100644
--- a/include/scsi/scsi_proto.h
+++ b/include/scsi/scsi_proto.h
@@ -299,4 +299,21 @@ struct scsi_lun {
#define SCSI_ACCESS_STATE_MASK 0x0f
#define SCSI_ACCESS_STATE_PREFERRED 0x80
+/* Reporting options for REPORT ZONES */
+enum zbc_zone_reporting_options {
+ ZBC_ZONE_REPORTING_OPTION_ALL = 0,
+ ZBC_ZONE_REPORTING_OPTION_EMPTY,
+ ZBC_ZONE_REPORTING_OPTION_IMPLICIT_OPEN,
+ ZBC_ZONE_REPORTING_OPTION_EXPLICIT_OPEN,
+ ZBC_ZONE_REPORTING_OPTION_CLOSED,
+ ZBC_ZONE_REPORTING_OPTION_FULL,
+ ZBC_ZONE_REPORTING_OPTION_READONLY,
+ ZBC_ZONE_REPORTING_OPTION_OFFLINE,
+ ZBC_ZONE_REPORTING_OPTION_NEED_RESET_WP = 0x10,
+ ZBC_ZONE_REPORTING_OPTION_NON_SEQWRITE,
+ ZBC_ZONE_REPORTING_OPTION_NON_WP = 0x3f,
+};
+
+#define ZBC_REPORT_ZONE_PARTIAL 0x80
+
#endif /* _SCSI_PROTO_H_ */
--
2.7.4
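For readers following the write path in sd_zbc_setup_read_write, the check
applied to sequential-write-required zones can be modeled in userspace
roughly as below. This is only a sketch: struct seq_zone, enum prep_result
and seq_zone_write are hypothetical names standing in for struct blk_zone,
the BLKPREP_* codes and the kernel function, and the zone write lock and
the "zone in update" deferral are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for BLKPREP_OK / BLKPREP_KILL. */
enum prep_result { PREP_OK, PREP_KILL };

/* Hypothetical, simplified view of one sequential-write-required zone. */
struct seq_zone {
	uint64_t start;	/* first sector of the zone */
	uint64_t len;	/* zone length in sectors */
	uint64_t wp;	/* current write pointer */
	int full;	/* set once wp reaches the end of the zone */
};

static enum prep_result seq_zone_write(struct seq_zone *z,
				       uint64_t sector, uint64_t nr_sects)
{
	uint64_t zone_end;

	assert(z != NULL);
	zone_end = z->start + z->len;

	/* Writes must not cross the zone boundary. */
	if (sector + nr_sects > zone_end)
		return PREP_KILL;

	/* Writes are only accepted at the write pointer position. */
	if (sector != z->wp)
		return PREP_KILL;

	/* Opportunistically advance the write pointer. */
	z->wp += nr_sects;
	if (z->wp >= zone_end) {
		z->wp = zone_end;
		z->full = 1;
	}
	return PREP_OK;
}
```

In the patch itself, a failed command triggers a zone update that corrects
the cached write pointer. The sketch has no equivalent of that step.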
* [PATCH 8/9] sd: Implement support for ZBC devices
@ 2016-09-19 21:27 ` Damien Le Moal
0 siblings, 0 replies; 36+ messages in thread
From: Damien Le Moal @ 2016-09-19 21:27 UTC (permalink / raw)
To: linux-scsi, linux-block
Cc: martin.petersen, axboe, hare, shaun.tancheff, Hannes Reinecke,
Damien Le Moal
From: Hannes Reinecke <hare@suse.com>
Implement ZBC support functions to set up zoned disks and fill the
block device zone information tree during the device scan. The
zone information tree is also always updated on disk revalidation.
This adds support for the REQ_OP_ZONE* operations and also implements
the new RESET_WP provisioning mode so that discard requests can be
mapped to the RESET WRITE POINTER command for devices with a constant
zone size.
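As an illustration of the discard mapping described above, the following
standalone sketch builds the 16-byte RESET WRITE POINTER CDB in the same
layout used by sd_zbc_setup_action_cmnd() in this patch. The helper names
build_reset_wp_cdb() and put_be64() are hypothetical and exist only for
this example; the opcode and service action values are assumed to match
the ZBC_OUT and ZO_RESET_WRITE_POINTER definitions in scsi_proto.h.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed to match the kernel's ZBC_OUT / ZO_RESET_WRITE_POINTER values. */
#define ZBC_OUT                0x94
#define ZO_RESET_WRITE_POINTER 0x04

/* Store a 64-bit value in big-endian byte order (hypothetical helper). */
static void put_be64(uint64_t v, uint8_t *p)
{
	for (int i = 7; i >= 0; i--) {
		p[i] = (uint8_t)(v & 0xff);
		v >>= 8;
	}
}

/*
 * Build a RESET WRITE POINTER CDB (hypothetical helper, not part of the
 * patch). With "all" set, the ALL bit in byte 14 is used and the LBA
 * field is left at zero; otherwise the zone start LBA goes in bytes 2-9.
 */
static void build_reset_wp_cdb(uint8_t cdb[16], uint64_t zone_start_lba,
			       int all)
{
	assert(cdb != NULL);

	memset(cdb, 0, 16);
	cdb[0] = ZBC_OUT;
	cdb[1] = ZO_RESET_WRITE_POINTER;
	if (all)
		cdb[14] |= 0x01;
	else
		put_be64(zone_start_lba, &cdb[2]);
}
```

Note that the LBA is in logical blocks, not 512-byte sectors, which is why
the patch converts blk_rq_pos() with sectors_to_logical() before filling
the CDB.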
The capacity read of the device triggers the zone information read
for zoned block devices. As this needs the device zone model, the
call to sd_read_capacity is moved after the call to
sd_read_block_characteristics so that host-aware devices are
properly initialized. The call to sd_zbc_read_zones in
sd_read_capacity may change the device capacity obtained with
the read_capacity_16 function for devices reporting only the
capacity of conventional zones at the beginning of the LBA range
(i.e. devices with rc_basis set to 0).
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
---
drivers/scsi/Makefile | 1 +
drivers/scsi/sd.c | 147 ++++--
drivers/scsi/sd.h | 68 +++
drivers/scsi/sd_zbc.c | 1097 +++++++++++++++++++++++++++++++++++++++++++++
include/scsi/scsi_proto.h | 17 +
5 files changed, 1304 insertions(+), 26 deletions(-)
create mode 100644 drivers/scsi/sd_zbc.c
diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile
index d539798..fabcb6d 100644
--- a/drivers/scsi/Makefile
+++ b/drivers/scsi/Makefile
@@ -179,6 +179,7 @@ hv_storvsc-y := storvsc_drv.o
sd_mod-objs := sd.o
sd_mod-$(CONFIG_BLK_DEV_INTEGRITY) += sd_dif.o
+sd_mod-$(CONFIG_BLK_DEV_ZONED) += sd_zbc.o
sr_mod-objs := sr.o sr_ioctl.o sr_vendor.o
ncr53c8xx-flags-$(CONFIG_SCSI_ZALON) \
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index d3e852a..46b8b78 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -92,6 +92,7 @@ MODULE_ALIAS_BLOCKDEV_MAJOR(SCSI_DISK15_MAJOR);
MODULE_ALIAS_SCSI_DEVICE(TYPE_DISK);
MODULE_ALIAS_SCSI_DEVICE(TYPE_MOD);
MODULE_ALIAS_SCSI_DEVICE(TYPE_RBC);
+MODULE_ALIAS_SCSI_DEVICE(TYPE_ZBC);
#if !defined(CONFIG_DEBUG_BLOCK_EXT_DEVT)
#define SD_MINORS 16
@@ -99,7 +100,6 @@ MODULE_ALIAS_SCSI_DEVICE(TYPE_RBC);
#define SD_MINORS 0
#endif
-static void sd_config_discard(struct scsi_disk *, unsigned int);
static void sd_config_write_same(struct scsi_disk *);
static int sd_revalidate_disk(struct gendisk *);
static void sd_unlock_native_capacity(struct gendisk *disk);
@@ -162,7 +162,7 @@ cache_type_store(struct device *dev, struct device_attribute *attr,
static const char temp[] = "temporary ";
int len;
- if (sdp->type != TYPE_DISK)
+ if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
/* no cache control on RBC devices; theoretically they
* can do it, but there's probably so many exceptions
* it's not worth the risk */
@@ -261,7 +261,7 @@ allow_restart_store(struct device *dev, struct device_attribute *attr,
if (!capable(CAP_SYS_ADMIN))
return -EACCES;
- if (sdp->type != TYPE_DISK)
+ if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
return -EINVAL;
sdp->allow_restart = simple_strtoul(buf, NULL, 10);
@@ -369,6 +369,7 @@ static const char *lbp_mode[] = {
[SD_LBP_WS16] = "writesame_16",
[SD_LBP_WS10] = "writesame_10",
[SD_LBP_ZERO] = "writesame_zero",
+ [SD_ZBC_RESET_WP] = "reset_wp",
[SD_LBP_DISABLE] = "disabled",
};
@@ -391,6 +392,13 @@ provisioning_mode_store(struct device *dev, struct device_attribute *attr,
if (!capable(CAP_SYS_ADMIN))
return -EACCES;
+ if (sdkp->zoned == 1 || sdp->type == TYPE_ZBC) {
+ if (!strncmp(buf, lbp_mode[SD_ZBC_RESET_WP], 20)) {
+ sd_config_discard(sdkp, SD_ZBC_RESET_WP);
+ return count;
+ }
+ return -EINVAL;
+ }
if (sdp->type != TYPE_DISK)
return -EINVAL;
@@ -458,7 +466,7 @@ max_write_same_blocks_store(struct device *dev, struct device_attribute *attr,
if (!capable(CAP_SYS_ADMIN))
return -EACCES;
- if (sdp->type != TYPE_DISK)
+ if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
return -EINVAL;
err = kstrtoul(buf, 10, &max);
@@ -631,7 +639,7 @@ static unsigned char sd_setup_protect_cmnd(struct scsi_cmnd *scmd,
return protect;
}
-static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode)
+void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode)
{
struct request_queue *q = sdkp->disk->queue;
unsigned int logical_block_size = sdkp->device->sector_size;
@@ -683,6 +691,11 @@ static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode)
q->limits.discard_zeroes_data = sdkp->lbprz;
break;
+ case SD_ZBC_RESET_WP:
+ max_blocks = min_not_zero(sdkp->max_unmap_blocks,
+ (u32)SD_MAX_WS16_BLOCKS);
+ break;
+
case SD_LBP_ZERO:
max_blocks = min_not_zero(sdkp->max_ws_blocks,
(u32)SD_MAX_WS10_BLOCKS);
@@ -711,16 +724,20 @@ static int sd_setup_discard_cmnd(struct scsi_cmnd *cmd)
unsigned int nr_sectors = blk_rq_sectors(rq);
unsigned int nr_bytes = blk_rq_bytes(rq);
unsigned int len;
- int ret;
+ int ret = BLKPREP_OK;
char *buf;
- struct page *page;
+ struct page *page = NULL;
sector >>= ilog2(sdp->sector_size) - 9;
nr_sectors >>= ilog2(sdp->sector_size) - 9;
- page = alloc_page(GFP_ATOMIC | __GFP_ZERO);
- if (!page)
- return BLKPREP_DEFER;
+ if (sdkp->provisioning_mode != SD_ZBC_RESET_WP) {
+ page = alloc_page(GFP_ATOMIC | __GFP_ZERO);
+ if (!page)
+ return BLKPREP_DEFER;
+ }
+
+ rq->completion_data = page;
switch (sdkp->provisioning_mode) {
case SD_LBP_UNMAP:
@@ -760,12 +777,19 @@ static int sd_setup_discard_cmnd(struct scsi_cmnd *cmd)
len = sdkp->device->sector_size;
break;
+ case SD_ZBC_RESET_WP:
+ ret = sd_zbc_setup_reset_cmnd(cmd);
+ if (ret != BLKPREP_OK)
+ goto out;
+ /* Reset Write Pointer doesn't have a payload */
+ len = 0;
+ break;
+
default:
ret = BLKPREP_INVALID;
goto out;
}
- rq->completion_data = page;
rq->timeout = SD_TIMEOUT;
cmd->transfersize = len;
@@ -779,13 +803,17 @@ static int sd_setup_discard_cmnd(struct scsi_cmnd *cmd)
* discarded on disk. This allows us to report completion on the full
* amount of blocks described by the request.
*/
- blk_add_request_payload(rq, page, 0, len);
- ret = scsi_init_io(cmd);
+ if (len) {
+ blk_add_request_payload(rq, page, 0, len);
+ ret = scsi_init_io(cmd);
+ }
rq->__data_len = nr_bytes;
out:
- if (ret != BLKPREP_OK)
+ if (page && ret != BLKPREP_OK) {
+ rq->completion_data = NULL;
__free_page(page);
+ }
return ret;
}
@@ -843,6 +871,13 @@ static int sd_setup_write_same_cmnd(struct scsi_cmnd *cmd)
BUG_ON(bio_offset(bio) || bio_iovec(bio).bv_len != sdp->sector_size);
+ if (sdkp->zoned == 1 || sdp->type == TYPE_ZBC) {
+ /* sd_zbc_setup_read_write uses block layer sector units */
+ ret = sd_zbc_setup_read_write(sdkp, rq, sector, &nr_sectors);
+ if (ret != BLKPREP_OK)
+ return ret;
+ }
+
sector >>= ilog2(sdp->sector_size) - 9;
nr_sectors >>= ilog2(sdp->sector_size) - 9;
@@ -962,6 +997,13 @@ static int sd_setup_read_write_cmnd(struct scsi_cmnd *SCpnt)
SCSI_LOG_HLQUEUE(2, scmd_printk(KERN_INFO, SCpnt, "block=%llu\n",
(unsigned long long)block));
+ if (sdkp->zoned == 1 || sdp->type == TYPE_ZBC) {
+ /* sd_zbc_setup_read_write uses block layer sector units */
+ ret = sd_zbc_setup_read_write(sdkp, rq, block, &this_count);
+ if (ret != BLKPREP_OK)
+ goto out;
+ }
+
/*
* If we have a 1K hardware sectorsize, prevent access to single
* 512 byte sectors. In theory we could handle this - in fact
@@ -1148,6 +1190,16 @@ static int sd_init_command(struct scsi_cmnd *cmd)
case REQ_OP_READ:
case REQ_OP_WRITE:
return sd_setup_read_write_cmnd(cmd);
+ case REQ_OP_ZONE_REPORT:
+ return sd_zbc_setup_report_cmnd(cmd);
+ case REQ_OP_ZONE_RESET:
+ return sd_zbc_setup_reset_cmnd(cmd);
+ case REQ_OP_ZONE_OPEN:
+ return sd_zbc_setup_open_cmnd(cmd);
+ case REQ_OP_ZONE_CLOSE:
+ return sd_zbc_setup_close_cmnd(cmd);
+ case REQ_OP_ZONE_FINISH:
+ return sd_zbc_setup_finish_cmnd(cmd);
default:
BUG();
}
@@ -1157,7 +1209,8 @@ static void sd_uninit_command(struct scsi_cmnd *SCpnt)
{
struct request *rq = SCpnt->request;
- if (req_op(rq) == REQ_OP_DISCARD)
+ if (req_op(rq) == REQ_OP_DISCARD &&
+ rq->completion_data)
__free_page(rq->completion_data);
if (SCpnt->cmnd != rq->cmd) {
@@ -1778,8 +1831,16 @@ static int sd_done(struct scsi_cmnd *SCpnt)
int sense_deferred = 0;
unsigned char op = SCpnt->cmnd[0];
unsigned char unmap = SCpnt->cmnd[1] & 8;
+ unsigned char sa = SCpnt->cmnd[1] & 0xf;
- if (req_op(req) == REQ_OP_DISCARD || req_op(req) == REQ_OP_WRITE_SAME) {
+ switch(req_op(req)) {
+ case REQ_OP_DISCARD:
+ case REQ_OP_WRITE_SAME:
+ case REQ_OP_ZONE_REPORT:
+ case REQ_OP_ZONE_RESET:
+ case REQ_OP_ZONE_OPEN:
+ case REQ_OP_ZONE_CLOSE:
+ case REQ_OP_ZONE_FINISH:
if (!result) {
good_bytes = blk_rq_bytes(req);
scsi_set_resid(SCpnt, 0);
@@ -1787,6 +1848,7 @@ static int sd_done(struct scsi_cmnd *SCpnt)
good_bytes = 0;
scsi_set_resid(SCpnt, blk_rq_bytes(req));
}
+ break;
}
if (result) {
@@ -1829,6 +1891,10 @@ static int sd_done(struct scsi_cmnd *SCpnt)
case UNMAP:
sd_config_discard(sdkp, SD_LBP_DISABLE);
break;
+ case ZBC_OUT:
+ if (sa == ZO_RESET_WRITE_POINTER)
+ sd_config_discard(sdkp, SD_LBP_DISABLE);
+ break;
case WRITE_SAME_16:
case WRITE_SAME:
if (unmap)
@@ -1847,7 +1913,11 @@ static int sd_done(struct scsi_cmnd *SCpnt)
default:
break;
}
+
out:
+ if (sdkp->zoned == 1 || sdkp->device->type == TYPE_ZBC)
+ sd_zbc_done(SCpnt, &sshdr);
+
SCSI_LOG_HLCOMPLETE(1, scmd_printk(KERN_INFO, SCpnt,
"sd_done: completed %d of %d bytes\n",
good_bytes, scsi_bufflen(SCpnt)));
@@ -1982,7 +2052,6 @@ sd_spinup_disk(struct scsi_disk *sdkp)
}
}
-
/*
* Determine whether disk supports Data Integrity Field.
*/
@@ -2132,6 +2201,9 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
/* Logical blocks per physical block exponent */
sdkp->physical_block_size = (1 << (buffer[13] & 0xf)) * sector_size;
+ /* RC basis */
+ sdkp->rc_basis = (buffer[12] >> 4) & 0x3;
+
/* Lowest aligned logical block */
alignment = ((buffer[14] & 0x3f) << 8 | buffer[15]) * sector_size;
blk_queue_alignment_offset(sdp->request_queue, alignment);
@@ -2322,6 +2394,11 @@ got_data:
sector_size = 512;
}
blk_queue_logical_block_size(sdp->request_queue, sector_size);
+ blk_queue_physical_block_size(sdp->request_queue,
+ sdkp->physical_block_size);
+ sdkp->device->sector_size = sector_size;
+
+ sd_zbc_read_zones(sdkp, buffer);
{
char cap_str_2[10], cap_str_10[10];
@@ -2348,9 +2425,6 @@ got_data:
if (sdkp->capacity > 0xffffffff)
sdp->use_16_for_rw = 1;
- blk_queue_physical_block_size(sdp->request_queue,
- sdkp->physical_block_size);
- sdkp->device->sector_size = sector_size;
}
/* called with buffer of length 512 */
@@ -2612,7 +2686,7 @@ static void sd_read_app_tag_own(struct scsi_disk *sdkp, unsigned char *buffer)
struct scsi_mode_data data;
struct scsi_sense_hdr sshdr;
- if (sdp->type != TYPE_DISK)
+ if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
return;
if (sdkp->protection_type == 0)
@@ -2719,6 +2793,7 @@ static void sd_read_block_limits(struct scsi_disk *sdkp)
*/
static void sd_read_block_characteristics(struct scsi_disk *sdkp)
{
+ struct request_queue *q = sdkp->disk->queue;
unsigned char *buffer;
u16 rot;
const int vpd_len = 64;
@@ -2733,10 +2808,21 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
rot = get_unaligned_be16(&buffer[4]);
if (rot == 1) {
- queue_flag_set_unlocked(QUEUE_FLAG_NONROT, sdkp->disk->queue);
- queue_flag_clear_unlocked(QUEUE_FLAG_ADD_RANDOM, sdkp->disk->queue);
+ queue_flag_set_unlocked(QUEUE_FLAG_NONROT, q);
+ queue_flag_clear_unlocked(QUEUE_FLAG_ADD_RANDOM, q);
}
+ sdkp->zoned = (buffer[8] >> 4) & 3;
+ if (sdkp->zoned == 1)
+ q->limits.zoned = BLK_ZONED_HA;
+ else if (sdkp->device->type == TYPE_ZBC)
+ q->limits.zoned = BLK_ZONED_HM;
+ else
+ q->limits.zoned = BLK_ZONED_NONE;
+ if (blk_queue_zoned(q) && sdkp->first_scan)
+ sd_printk(KERN_NOTICE, sdkp, "Host-%s zoned block device\n",
+ q->limits.zoned == BLK_ZONED_HM ? "managed" : "aware");
+
out:
kfree(buffer);
}
@@ -2835,14 +2921,14 @@ static int sd_revalidate_disk(struct gendisk *disk)
* react badly if we do.
*/
if (sdkp->media_present) {
- sd_read_capacity(sdkp, buffer);
-
if (scsi_device_supports_vpd(sdp)) {
sd_read_block_provisioning(sdkp);
sd_read_block_limits(sdkp);
sd_read_block_characteristics(sdkp);
}
+ sd_read_capacity(sdkp, buffer);
+
sd_read_write_protect_flag(sdkp, buffer);
sd_read_cache_type(sdkp, buffer);
sd_read_app_tag_own(sdkp, buffer);
@@ -3040,9 +3126,16 @@ static int sd_probe(struct device *dev)
scsi_autopm_get_device(sdp);
error = -ENODEV;
- if (sdp->type != TYPE_DISK && sdp->type != TYPE_MOD && sdp->type != TYPE_RBC)
+ if (sdp->type != TYPE_DISK &&
+ sdp->type != TYPE_ZBC &&
+ sdp->type != TYPE_MOD &&
+ sdp->type != TYPE_RBC)
goto out;
+#ifndef CONFIG_BLK_DEV_ZONED
+ if (sdp->type == TYPE_ZBC)
+ goto out;
+#endif
SCSI_LOG_HLQUEUE(3, sdev_printk(KERN_INFO, sdp,
"sd_probe\n"));
@@ -3146,6 +3239,8 @@ static int sd_remove(struct device *dev)
del_gendisk(sdkp->disk);
sd_shutdown(dev);
+ sd_zbc_remove(sdkp);
+
blk_register_region(devt, SD_MINORS, NULL,
sd_default_probe, NULL, NULL);
diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
index 765a6f1..3452871 100644
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -56,6 +56,7 @@ enum {
SD_LBP_WS16, /* Use WRITE SAME(16) with UNMAP bit */
SD_LBP_WS10, /* Use WRITE SAME(10) with UNMAP bit */
SD_LBP_ZERO, /* Use WRITE SAME(10) with zero payload */
+ SD_ZBC_RESET_WP, /* Use RESET WRITE POINTER */
SD_LBP_DISABLE, /* Discard disabled due to failed cmd */
};
@@ -64,6 +65,11 @@ struct scsi_disk {
struct scsi_device *device;
struct device dev;
struct gendisk *disk;
+#ifdef CONFIG_BLK_DEV_ZONED
+ struct workqueue_struct *zone_work_q;
+ sector_t zone_sectors;
+ unsigned int nr_zones;
+#endif
atomic_t openers;
sector_t capacity; /* size in logical blocks */
u32 max_xfer_blocks;
@@ -94,6 +100,8 @@ struct scsi_disk {
unsigned lbpvpd : 1;
unsigned ws10 : 1;
unsigned ws16 : 1;
+ unsigned rc_basis: 2;
+ unsigned zoned: 2;
};
#define to_scsi_disk(obj) container_of(obj,struct scsi_disk,dev)
@@ -156,6 +164,13 @@ static inline unsigned int logical_to_bytes(struct scsi_device *sdev, sector_t b
return blocks * sdev->sector_size;
}
+static inline sector_t sectors_to_logical(struct scsi_device *sdev, sector_t sector)
+{
+ return sector >> (ilog2(sdev->sector_size) - 9);
+}
+
+extern void sd_config_discard(struct scsi_disk *, unsigned int);
+
/*
* A DIF-capable target device can be formatted with different
* protection schemes. Currently 0 through 3 are defined:
@@ -269,4 +284,57 @@ static inline void sd_dif_complete(struct scsi_cmnd *cmd, unsigned int a)
#endif /* CONFIG_BLK_DEV_INTEGRITY */
+#ifdef CONFIG_BLK_DEV_ZONED
+
+extern void sd_zbc_read_zones(struct scsi_disk *, char *);
+extern void sd_zbc_remove(struct scsi_disk *);
+extern int sd_zbc_setup_read_write(struct scsi_disk *, struct request *,
+ sector_t, unsigned int *);
+extern int sd_zbc_setup_report_cmnd(struct scsi_cmnd *);
+extern int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *);
+extern int sd_zbc_setup_open_cmnd(struct scsi_cmnd *);
+extern int sd_zbc_setup_close_cmnd(struct scsi_cmnd *);
+extern int sd_zbc_setup_finish_cmnd(struct scsi_cmnd *);
+extern void sd_zbc_done(struct scsi_cmnd *, struct scsi_sense_hdr *);
+
+#else /* CONFIG_BLK_DEV_ZONED */
+
+static inline void sd_zbc_read_zones(struct scsi_disk *sdkp,
+ unsigned char *buf) {}
+static inline void sd_zbc_remove(struct scsi_disk *sdkp) {}
+
+static inline int sd_zbc_setup_read_write(struct scsi_disk *sdkp,
+ struct request *rq, sector_t sector,
+ unsigned int *num_sectors)
+{
+ /* Let the drive fail requests */
+ return BLKPREP_OK;
+}
+
+static inline int sd_zbc_setup_report_cmnd(struct scsi_cmnd *cmd)
+{
+ return BLKPREP_KILL;
+}
+static inline int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *cmd)
+{
+ return BLKPREP_KILL;
+}
+static inline int sd_zbc_setup_open_cmnd(struct scsi_cmnd *cmd)
+{
+ return BLKPREP_KILL;
+}
+static inline int sd_zbc_setup_close_cmnd(struct scsi_cmnd *cmd)
+{
+ return BLKPREP_KILL;
+}
+static inline int sd_zbc_setup_finish_cmnd(struct scsi_cmnd *cmd)
+{
+ return BLKPREP_KILL;
+}
+
+static inline void sd_zbc_done(struct scsi_cmnd *cmd,
+ struct scsi_sense_hdr *sshdr) {}
+
+#endif /* CONFIG_BLK_DEV_ZONED */
+
#endif /* _SCSI_DISK_H */
diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
new file mode 100644
index 0000000..ec9c3fc
--- /dev/null
+++ b/drivers/scsi/sd_zbc.c
@@ -0,0 +1,1097 @@
+/*
+ * SCSI Zoned Block commands
+ *
+ * Copyright (C) 2014-2015 SUSE Linux GmbH
+ * Written by: Hannes Reinecke <hare@suse.de>
+ * Modified by: Damien Le Moal <damien.lemoal@hgst.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License version
+ * 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; see the file COPYING. If not, write to
+ * the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139,
+ * USA.
+ *
+ */
+
+#include <linux/blkdev.h>
+#include <linux/rbtree.h>
+
+#include <asm/unaligned.h>
+
+#include <scsi/scsi.h>
+#include <scsi/scsi_cmnd.h>
+#include <scsi/scsi_dbg.h>
+#include <scsi/scsi_device.h>
+#include <scsi/scsi_driver.h>
+#include <scsi/scsi_host.h>
+#include <scsi/scsi_eh.h>
+
+#include "sd.h"
+#include "scsi_priv.h"
+
+enum zbc_zone_type {
+ ZBC_ZONE_TYPE_CONV = 0x1,
+ ZBC_ZONE_TYPE_SEQWRITE_REQ,
+ ZBC_ZONE_TYPE_SEQWRITE_PREF,
+ ZBC_ZONE_TYPE_RESERVED,
+};
+
+enum zbc_zone_cond {
+ ZBC_ZONE_COND_NO_WP,
+ ZBC_ZONE_COND_EMPTY,
+ ZBC_ZONE_COND_IMP_OPEN,
+ ZBC_ZONE_COND_EXP_OPEN,
+ ZBC_ZONE_COND_CLOSED,
+ ZBC_ZONE_COND_READONLY = 0xd,
+ ZBC_ZONE_COND_FULL,
+ ZBC_ZONE_COND_OFFLINE,
+};
+
+#define SD_ZBC_BUF_SIZE 131072
+
+#define sd_zbc_debug(sdkp, fmt, args...) \
+ pr_debug("%s %s [%s]: " fmt, \
+ dev_driver_string(&(sdkp)->device->sdev_gendev), \
+ dev_name(&(sdkp)->device->sdev_gendev), \
+ (sdkp)->disk->disk_name, ## args)
+
+#define sd_zbc_debug_ratelimit(sdkp, fmt, args...) \
+ do { \
+ if (printk_ratelimit()) \
+ sd_zbc_debug(sdkp, fmt, ## args); \
+ } while( 0 )
+
+#define sd_zbc_err(sdkp, fmt, args...) \
+ pr_err("%s %s [%s]: " fmt, \
+ dev_driver_string(&(sdkp)->device->sdev_gendev), \
+ dev_name(&(sdkp)->device->sdev_gendev), \
+ (sdkp)->disk->disk_name, ## args)
+
+struct zbc_zone_work {
+ struct work_struct zone_work;
+ struct scsi_disk *sdkp;
+ sector_t sector;
+ sector_t nr_sects;
+ bool init;
+ unsigned int nr_zones;
+};
+
+struct blk_zone *zbc_desc_to_zone(struct scsi_disk *sdkp, unsigned char *rec)
+{
+ struct blk_zone *zone;
+
+ zone = kzalloc(sizeof(struct blk_zone), GFP_KERNEL);
+ if (!zone)
+ return NULL;
+
+ /* Zone type */
+ switch(rec[0] & 0x0f) {
+ case ZBC_ZONE_TYPE_CONV:
+ case ZBC_ZONE_TYPE_SEQWRITE_REQ:
+ case ZBC_ZONE_TYPE_SEQWRITE_PREF:
+ zone->type = rec[0] & 0x0f;
+ break;
+ default:
+ zone->type = BLK_ZONE_TYPE_UNKNOWN;
+ break;
+ }
+
+ /* Zone condition */
+ zone->cond = (rec[1] >> 4) & 0xf;
+ if (rec[1] & 0x01)
+ zone->reset = 1;
+ if (rec[1] & 0x02)
+ zone->non_seq = 1;
+
+ /* Zone start sector and length */
+ zone->len = logical_to_sectors(sdkp->device,
+ get_unaligned_be64(&rec[8]));
+ zone->start = logical_to_sectors(sdkp->device,
+ get_unaligned_be64(&rec[16]));
+
+ /* Zone write pointer */
+ if (blk_zone_is_empty(zone) &&
+ zone->wp != zone->start)
+ zone->wp = zone->start;
+ else if (blk_zone_is_full(zone))
+ zone->wp = zone->start + zone->len;
+ else if (blk_zone_is_seq(zone))
+ zone->wp = logical_to_sectors(sdkp->device,
+ get_unaligned_be64(&rec[24]));
+ else
+ zone->wp = (sector_t)-1;
+
+ return zone;
+}
+
+static int zbc_parse_zones(struct scsi_disk *sdkp, unsigned char *buf,
+ unsigned int buf_len, sector_t *next_sector)
+{
+ struct request_queue *q = sdkp->disk->queue;
+ sector_t capacity = logical_to_sectors(sdkp->device, sdkp->capacity);
+ unsigned char *rec = buf;
+ unsigned int zone_len, list_length;
+
+ /* Parse REPORT ZONES header */
+ list_length = get_unaligned_be32(&buf[0]);
+ rec = buf + 64;
+ list_length += 64;
+
+ if (list_length < buf_len)
+ buf_len = list_length;
+
+ /* Parse REPORT ZONES zone descriptors */
+ *next_sector = capacity;
+ while (rec < buf + buf_len) {
+
+ struct blk_zone *new, *old;
+
+ new = zbc_desc_to_zone(sdkp, rec);
+ if (!new)
+ return -ENOMEM;
+
+ zone_len = new->len;
+ *next_sector = new->start + zone_len;
+
+ old = blk_insert_zone(q, new);
+ if (old) {
+ blk_lock_zone(old);
+
+ /*
+ * Always update the zone state flags and the zone
+ * offline and read-only condition as the drive may
+ * change those independently of the commands being
+ * executed
+ */
+ old->reset = new->reset;
+ old->non_seq = new->non_seq;
+ if (blk_zone_is_offline(new) ||
+ blk_zone_is_readonly(new))
+ old->cond = new->cond;
+
+ if (blk_zone_in_update(old)) {
+ old->cond = new->cond;
+ old->wp = new->wp;
+ blk_clear_zone_update(old);
+ }
+
+ blk_unlock_zone(old);
+
+ kfree(new);
+ }
+
+ rec += 64;
+
+ }
+
+ return 0;
+}
+
+/**
+ * sd_zbc_report_zones - Issue a REPORT ZONES scsi command
+ * @sdkp: SCSI disk to which the command should be sent
+ * @buffer: response buffer
+ * @bufflen: length of @buffer
+ * @start_sector: sector from which zone information should be reported
+ * @option: reporting option to be used
+ * @partial: flag to set the 'partial' bit for report zones command
+ */
+int sd_zbc_report_zones(struct scsi_disk *sdkp, unsigned char *buffer,
+ int bufflen, sector_t start_sector,
+ enum zbc_zone_reporting_options option, bool partial)
+{
+ struct scsi_device *sdp = sdkp->device;
+ const int timeout = sdp->request_queue->rq_timeout;
+ struct scsi_sense_hdr sshdr;
+ sector_t start_lba = sectors_to_logical(sdkp->device, start_sector);
+ unsigned char cmd[16];
+ int result;
+
+ if (!scsi_device_online(sdp))
+ return -ENODEV;
+
+ sd_zbc_debug(sdkp, "REPORT ZONES lba %zu len %d\n",
+ start_lba, bufflen);
+
+ memset(cmd, 0, 16);
+ cmd[0] = ZBC_IN;
+ cmd[1] = ZI_REPORT_ZONES;
+ put_unaligned_be64(start_lba, &cmd[2]);
+ put_unaligned_be32(bufflen, &cmd[10]);
+ cmd[14] = (partial ? ZBC_REPORT_ZONE_PARTIAL : 0) | option;
+ memset(buffer, 0, bufflen);
+
+ result = scsi_execute_req(sdp, cmd, DMA_FROM_DEVICE,
+ buffer, bufflen, &sshdr,
+ timeout, SD_MAX_RETRIES, NULL);
+
+ if (result) {
+ sd_zbc_err(sdkp,
+ "REPORT ZONES lba %zu failed with %d/%d\n",
+ start_lba, host_byte(result), driver_byte(result));
+ return -EIO;
+ }
+
+ return 0;
+}
+
+/**
+ * Set or clear the update flag of all zones contained
+ * in the range sector..sector+nr_sects.
+ * Return the number of zones marked/cleared.
+ */
+static int __sd_zbc_zones_updating(struct scsi_disk *sdkp,
+ sector_t sector, sector_t nr_sects,
+ bool set)
+{
+ struct request_queue *q = sdkp->disk->queue;
+ struct blk_zone *zone;
+ struct rb_node *node;
+ unsigned long flags;
+ int nr_zones = 0;
+
+ if (!nr_sects) {
+ /* All zones */
+ sector = 0;
+ nr_sects = logical_to_sectors(sdkp->device, sdkp->capacity);
+ }
+
+ spin_lock_irqsave(&q->zones_lock, flags);
+ for (node = rb_first(&q->zones); node && nr_sects; node = rb_next(node)) {
+ zone = rb_entry(node, struct blk_zone, node);
+ if (sector < zone->start || sector >= (zone->start + zone->len))
+ continue;
+ if (set) {
+ if (!test_and_set_bit_lock(BLK_ZONE_IN_UPDATE, &zone->flags))
+ nr_zones++;
+ } else if (test_and_clear_bit(BLK_ZONE_IN_UPDATE, &zone->flags)) {
+ wake_up_bit(&zone->flags, BLK_ZONE_IN_UPDATE);
+ nr_zones++;
+ }
+ sector = zone->start + zone->len;
+ if (nr_sects <= zone->len)
+ nr_sects = 0;
+ else
+ nr_sects -= zone->len;
+ }
+ spin_unlock_irqrestore(&q->zones_lock, flags);
+
+ return nr_zones;
+}
+
+static inline int sd_zbc_set_zones_updating(struct scsi_disk *sdkp,
+ sector_t sector, sector_t nr_sects)
+{
+ return __sd_zbc_zones_updating(sdkp, sector, nr_sects, true);
+}
+
+static inline int sd_zbc_clear_zones_updating(struct scsi_disk *sdkp,
+ sector_t sector, sector_t nr_sects)
+{
+ return __sd_zbc_zones_updating(sdkp, sector, nr_sects, false);
+}
+
+static void sd_zbc_start_queue(struct request_queue *q)
+{
+ unsigned long flags;
+
+ if (q->mq_ops) {
+ blk_mq_start_hw_queues(q);
+ } else {
+ spin_lock_irqsave(q->queue_lock, flags);
+ blk_start_queue(q);
+ spin_unlock_irqrestore(q->queue_lock, flags);
+ }
+}
+
+static void sd_zbc_update_zone_work(struct work_struct *work)
+{
+ struct zbc_zone_work *zwork =
+ container_of(work, struct zbc_zone_work, zone_work);
+ struct scsi_disk *sdkp = zwork->sdkp;
+ sector_t capacity = logical_to_sectors(sdkp->device, sdkp->capacity);
+ struct request_queue *q = sdkp->disk->queue;
+ sector_t end_sector, sector = zwork->sector;
+ unsigned int bufsize;
+ unsigned char *buf;
+ int ret = -ENOMEM;
+
+ /* Get a buffer */
+ if (!zwork->nr_zones) {
+ bufsize = SD_ZBC_BUF_SIZE;
+ } else {
+ bufsize = (zwork->nr_zones + 1) * 64;
+ if (bufsize < 512)
+ bufsize = 512;
+ else if (bufsize > SD_ZBC_BUF_SIZE)
+ bufsize = SD_ZBC_BUF_SIZE;
+ else
+ bufsize = (bufsize + 511) & ~511;
+ }
+ buf = kmalloc(bufsize, GFP_KERNEL | GFP_DMA);
+ if (!buf) {
+ sd_zbc_err(sdkp, "Failed to allocate zone report buffer\n");
+ goto done_free;
+ }
+
+ /* Process sector range */
+ end_sector = zwork->sector + zwork->nr_sects;
+ while(sector < min(end_sector, capacity)) {
+
+ /* Get zone report */
+ ret = sd_zbc_report_zones(sdkp, buf, bufsize, sector,
+ ZBC_ZONE_REPORTING_OPTION_ALL, true);
+ if (ret)
+ break;
+
+ ret = zbc_parse_zones(sdkp, buf, bufsize, &sector);
+ if (ret)
+ break;
+
+ /* Kick start the queue to allow requests waiting */
+ /* for the zones just updated to run */
+ sd_zbc_start_queue(q);
+
+ }
+
+done_free:
+ if (ret)
+ sd_zbc_clear_zones_updating(sdkp, zwork->sector, zwork->nr_sects);
+ if (buf)
+ kfree(buf);
+ kfree(zwork);
+}
+
+/**
+ * sd_zbc_update_zones - Update zone information for the zones in
+ * @sector..@sector+@nr_sects (all zones if @nr_sects is 0). If not in
+ * @init mode, only zones marked with the update flag are updated.
+ * @sdkp: SCSI disk for which the zone information needs to be updated
+ * @sector: First sector of the first zone to be updated
+ * @gfpflags: Allocation flags for the zone work item
+ */
+static int sd_zbc_update_zones(struct scsi_disk *sdkp,
+ sector_t sector, sector_t nr_sects,
+ gfp_t gfpflags, bool init)
+{
+ struct zbc_zone_work *zwork;
+
+ zwork = kzalloc(sizeof(struct zbc_zone_work), gfpflags);
+ if (!zwork) {
+ sd_zbc_err(sdkp, "Failed to allocate zone work\n");
+ return -ENOMEM;
+ }
+
+ if (!nr_sects) {
+ /* All zones */
+ sector = 0;
+ nr_sects = logical_to_sectors(sdkp->device, sdkp->capacity);
+ }
+
+ INIT_WORK(&zwork->zone_work, sd_zbc_update_zone_work);
+ zwork->sdkp = sdkp;
+ zwork->sector = sector;
+ zwork->nr_sects = nr_sects;
+ zwork->init = init;
+
+ if (!init)
+ /* Mark the zones falling in the report as updating */
+ zwork->nr_zones = sd_zbc_set_zones_updating(sdkp, sector, nr_sects);
+
+ if (init || zwork->nr_zones)
+ queue_work(sdkp->zone_work_q, &zwork->zone_work);
+ else
+ kfree(zwork);
+
+ return 0;
+}
+
+int sd_zbc_setup_report_cmnd(struct scsi_cmnd *cmd)
+{
+ struct request *rq = cmd->request;
+ struct gendisk *disk = rq->rq_disk;
+ struct scsi_disk *sdkp = scsi_disk(disk);
+ int ret;
+
+ if (!sdkp->zone_work_q)
+ return BLKPREP_KILL;
+
+ ret = sd_zbc_update_zones(sdkp, blk_rq_pos(rq), blk_rq_sectors(rq),
+ GFP_ATOMIC, false);
+ if (unlikely(ret))
+ return BLKPREP_DEFER;
+
+ return BLKPREP_DONE;
+}
+
+static void sd_zbc_setup_action_cmnd(struct scsi_cmnd *cmd,
+ u8 action,
+ bool all)
+{
+ struct request *rq = cmd->request;
+ struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
+ sector_t lba;
+
+ cmd->cmd_len = 16;
+ cmd->cmnd[0] = ZBC_OUT;
+ cmd->cmnd[1] = action;
+ if (all) {
+ cmd->cmnd[14] |= 0x01;
+ } else {
+ lba = sectors_to_logical(sdkp->device, blk_rq_pos(rq));
+ put_unaligned_be64(lba, &cmd->cmnd[2]);
+ }
+
+ rq->completion_data = NULL;
+ rq->timeout = SD_TIMEOUT;
+ rq->__data_len = blk_rq_bytes(rq);
+
+ /* Don't retry */
+ cmd->allowed = 0;
+ cmd->transfersize = 0;
+ cmd->sc_data_direction = DMA_NONE;
+}
+
+int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *cmd)
+{
+ struct request *rq = cmd->request;
+ struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
+ sector_t sector = blk_rq_pos(rq);
+ sector_t nr_sects = blk_rq_sectors(rq);
+ struct blk_zone *zone = NULL;
+ int ret = BLKPREP_OK;
+
+ if (nr_sects) {
+ zone = blk_lookup_zone(rq->q, sector);
+ if (!zone)
+ return BLKPREP_KILL;
+ }
+
+ if (zone) {
+
+ blk_lock_zone(zone);
+
+ /* If the zone is being updated, wait */
+ if (blk_zone_in_update(zone)) {
+ ret = BLKPREP_DEFER;
+ goto out;
+ }
+
+ if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
+ sd_zbc_debug(sdkp,
+ "Discarding unknown zone %zu\n",
+ zone->start);
+ ret = BLKPREP_KILL;
+ goto out;
+ }
+
+ /* Nothing to do for conventional zones */
+ if (blk_zone_is_conv(zone)) {
+ ret = BLKPREP_DONE;
+ goto out;
+ }
+
+ if (!blk_try_write_lock_zone(zone)) {
+ ret = BLKPREP_DEFER;
+ goto out;
+ }
+
+ /* Nothing to do if the zone is already empty */
+ if (blk_zone_is_empty(zone)) {
+ blk_write_unlock_zone(zone);
+ ret = BLKPREP_DONE;
+ goto out;
+ }
+
+ if (sector != zone->start ||
+ (nr_sects != zone->len)) {
+ sd_printk(KERN_ERR, sdkp,
+ "Unaligned reset wp request, start %zu/%zu"
+ " len %zu/%zu\n",
+ zone->start, sector, zone->len, nr_sects);
+ blk_write_unlock_zone(zone);
+ ret = BLKPREP_KILL;
+ goto out;
+ }
+
+ }
+
+ sd_zbc_setup_action_cmnd(cmd, ZO_RESET_WRITE_POINTER, !zone);
+
+out:
+ if (zone) {
+ if (ret == BLKPREP_OK) {
+ /*
+ * Opportunistic update. Will be fixed up
+ * with zone update if the command fails.
+ */
+ zone->wp = zone->start;
+ zone->cond = BLK_ZONE_COND_EMPTY;
+ zone->reset = 0;
+ zone->non_seq = 0;
+ }
+ blk_unlock_zone(zone);
+ }
+
+ return ret;
+}
+
+int sd_zbc_setup_open_cmnd(struct scsi_cmnd *cmd)
+{
+ struct request *rq = cmd->request;
+ struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
+ sector_t sector = blk_rq_pos(rq);
+ sector_t nr_sects = blk_rq_sectors(rq);
+ struct blk_zone *zone = NULL;
+ int ret = BLKPREP_OK;
+
+ if (nr_sects) {
+ zone = blk_lookup_zone(rq->q, sector);
+ if (!zone)
+ return BLKPREP_KILL;
+ }
+
+ if (zone) {
+
+ blk_lock_zone(zone);
+
+ /* If the zone is being updated, wait */
+ if (blk_zone_in_update(zone)) {
+ ret = BLKPREP_DEFER;
+ goto out;
+ }
+
+ if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
+ sd_zbc_debug(sdkp,
+ "Opening unknown zone %zu\n",
+ zone->start);
+ ret = BLKPREP_KILL;
+ goto out;
+ }
+
+ /*
+ * Nothing to do for conventional zones,
+ * zones already open or full zones.
+ */
+ if (blk_zone_is_conv(zone) ||
+ blk_zone_is_open(zone) ||
+ blk_zone_is_full(zone)) {
+ ret = BLKPREP_DONE;
+ goto out;
+ }
+
+ if (sector != zone->start ||
+ (nr_sects != zone->len)) {
+ sd_printk(KERN_ERR, sdkp,
+ "Unaligned open zone request, start %zu/%zu"
+ " len %zu/%zu\n",
+ zone->start, sector, zone->len, nr_sects);
+ ret = BLKPREP_KILL;
+ goto out;
+ }
+
+ }
+
+ sd_zbc_setup_action_cmnd(cmd, ZO_OPEN_ZONE, !zone);
+
+out:
+ if (zone) {
+ if (ret == BLKPREP_OK)
+ /*
+ * Opportunistic update. Will be fixed up
+ * with zone update if the command fails.
+ */
+ zone->cond = BLK_ZONE_COND_EXP_OPEN;
+ blk_unlock_zone(zone);
+ }
+
+ return ret;
+}
+
+int sd_zbc_setup_close_cmnd(struct scsi_cmnd *cmd)
+{
+ struct request *rq = cmd->request;
+ struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
+ sector_t sector = blk_rq_pos(rq);
+ sector_t nr_sects = blk_rq_sectors(rq);
+ struct blk_zone *zone = NULL;
+ int ret = BLKPREP_OK;
+
+ if (nr_sects) {
+ zone = blk_lookup_zone(rq->q, sector);
+ if (!zone)
+ return BLKPREP_KILL;
+ }
+
+ if (zone) {
+
+ blk_lock_zone(zone);
+
+ /* If the zone is being updated, wait */
+ if (blk_zone_in_update(zone)) {
+ ret = BLKPREP_DEFER;
+ goto out;
+ }
+
+ if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
+ sd_zbc_debug(sdkp,
+ "Closing unknown zone %zu\n",
+ zone->start);
+ ret = BLKPREP_KILL;
+ goto out;
+ }
+
+ /*
+ * Nothing to do for conventional zones,
+ * full zones or empty zones.
+ */
+ if (blk_zone_is_conv(zone) ||
+ blk_zone_is_full(zone) ||
+ blk_zone_is_empty(zone)) {
+ ret = BLKPREP_DONE;
+ goto out;
+ }
+
+ if (sector != zone->start ||
+ (nr_sects != zone->len)) {
+ sd_printk(KERN_ERR, sdkp,
+ "Unaligned close zone request, start %zu/%zu"
+ " len %zu/%zu\n",
+ zone->start, sector, zone->len, nr_sects);
+ ret = BLKPREP_KILL;
+ goto out;
+ }
+
+ }
+
+ sd_zbc_setup_action_cmnd(cmd, ZO_CLOSE_ZONE, !zone);
+
+out:
+ if (zone) {
+ if (ret == BLKPREP_OK)
+ /*
+ * Opportunistic update. Will be fixed up
+ * with zone update if the command fails.
+ */
+ zone->cond = BLK_ZONE_COND_CLOSED;
+ blk_unlock_zone(zone);
+ }
+
+ return ret;
+}
+
+int sd_zbc_setup_finish_cmnd(struct scsi_cmnd *cmd)
+{
+ struct request *rq = cmd->request;
+ struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
+ sector_t sector = blk_rq_pos(rq);
+ sector_t nr_sects = blk_rq_sectors(rq);
+ struct blk_zone *zone = NULL;
+ int ret = BLKPREP_OK;
+
+ if (nr_sects) {
+ zone = blk_lookup_zone(rq->q, sector);
+ if (!zone)
+ return BLKPREP_KILL;
+ }
+
+ if (zone) {
+
+ blk_lock_zone(zone);
+
+ /* If the zone is being updated, wait */
+ if (blk_zone_in_update(zone)) {
+ ret = BLKPREP_DEFER;
+ goto out;
+ }
+
+ if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
+ sd_zbc_debug(sdkp,
+ "Finishing unknown zone %zu\n",
+ zone->start);
+ ret = BLKPREP_KILL;
+ goto out;
+ }
+
+ /* Nothing to do for conventional zones and full zones */
+ if (blk_zone_is_conv(zone) ||
+ blk_zone_is_full(zone)) {
+ ret = BLKPREP_DONE;
+ goto out;
+ }
+
+ if (sector != zone->start ||
+ (nr_sects != zone->len)) {
+ sd_printk(KERN_ERR, sdkp,
+ "Unaligned finish zone request, start %zu/%zu"
+ " len %zu/%zu\n",
+ zone->start, sector, zone->len, nr_sects);
+ ret = BLKPREP_KILL;
+ goto out;
+ }
+
+ }
+
+ sd_zbc_setup_action_cmnd(cmd, ZO_FINISH_ZONE, !zone);
+
+out:
+ if (zone) {
+ if (ret == BLKPREP_OK) {
+ /*
+ * Opportunistic update. Will be fixed up
+ * with zone update if the command fails.
+ */
+ zone->cond = BLK_ZONE_COND_FULL;
+ if (blk_zone_is_seq(zone))
+ zone->wp = zone->start + zone->len;
+ }
+ blk_unlock_zone(zone);
+ }
+
+ return ret;
+}
+
+int sd_zbc_setup_read_write(struct scsi_disk *sdkp, struct request *rq,
+ sector_t sector, unsigned int *num_sectors)
+{
+ struct blk_zone *zone;
+ unsigned int sectors = *num_sectors;
+ int ret = BLKPREP_OK;
+
+ zone = blk_lookup_zone(rq->q, sector);
+ if (!zone)
+ /* Let the drive handle the request */
+ return BLKPREP_OK;
+
+ blk_lock_zone(zone);
+
+ /* If the zone is being updated, wait */
+ if (blk_zone_in_update(zone)) {
+ ret = BLKPREP_DEFER;
+ goto out;
+ }
+
+ if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
+ sd_zbc_debug(sdkp,
+ "Unknown zone %zu\n",
+ zone->start);
+ ret = BLKPREP_KILL;
+ goto out;
+ }
+
+ /* For offline and read-only zones, let the drive fail the command */
+ if (blk_zone_is_offline(zone) ||
+ blk_zone_is_readonly(zone))
+ goto out;
+
+ /* Do not allow requests that cross a zone boundary */
+ if (sector + sectors > zone->start + zone->len) {
+ ret = BLKPREP_KILL;
+ goto out;
+ }
+
+ /* For conventional zones, no checks */
+ if (blk_zone_is_conv(zone))
+ goto out;
+
+ if (req_op(rq) == REQ_OP_WRITE ||
+ req_op(rq) == REQ_OP_WRITE_SAME) {
+
+ /*
+ * Write requests may change the write pointer and
+ * transition the zone condition to full. Changes
+ * are opportunistic here. If the request fails, a
+ * zone update will fix the zone information.
+ */
+ if (blk_zone_is_seq_req(zone)) {
+
+ /*
+ * Do not issue more than one write at a time per
+ * zone. This solves write ordering problems due to
+ * the unlocking of the request queue in the dispatch
+ * path in the non scsi-mq case. For scsi-mq, this
+ * also avoids potential write reordering when multiple
+ * threads running on different CPUs write to the same
+ * zone (with a synchronized sequential pattern).
+ */
+ if (!blk_try_write_lock_zone(zone)) {
+ ret = BLKPREP_DEFER;
+ goto out;
+ }
+
+ /* For host-managed drives, writes are allowed */
+ /* only at the write pointer position. */
+ if (zone->wp != sector) {
+ blk_write_unlock_zone(zone);
+ ret = BLKPREP_KILL;
+ goto out;
+ }
+
+ zone->wp += sectors;
+ if (zone->wp >= zone->start + zone->len) {
+ zone->cond = BLK_ZONE_COND_FULL;
+ zone->wp = zone->start + zone->len;
+ }
+
+ } else {
+
+ /* For host-aware drives, writes are allowed */
+ /* anywhere in the zone, but wp can only go */
+ /* forward. */
+ sector_t end_sector = sector + sectors;
+ if (sector == zone->wp &&
+ end_sector >= zone->start + zone->len) {
+ zone->cond = BLK_ZONE_COND_FULL;
+ zone->wp = zone->start + zone->len;
+ } else if (end_sector > zone->wp) {
+ zone->wp = end_sector;
+ }
+
+ }
+
+ } else {
+
+ /* Check read after write pointer */
+ if (sector + sectors <= zone->wp)
+ goto out;
+
+ if (zone->wp <= sector) {
+ /* Read beyond WP: clear request buffer */
+ struct req_iterator iter;
+ struct bio_vec bvec;
+ unsigned long flags;
+ void *buf;
+ rq_for_each_segment(bvec, rq, iter) {
+ buf = bvec_kmap_irq(&bvec, &flags);
+ memset(buf, 0, bvec.bv_len);
+ flush_dcache_page(bvec.bv_page);
+ bvec_kunmap_irq(buf, &flags);
+ }
+ ret = BLKPREP_DONE;
+ goto out;
+ }
+
+ /* Read straddles the WP position: limit request size */
+ *num_sectors = zone->wp - sector;
+
+ }
+
+out:
+ blk_unlock_zone(zone);
+
+ return ret;
+}
+
+void sd_zbc_done(struct scsi_cmnd *cmd,
+ struct scsi_sense_hdr *sshdr)
+{
+ int result = cmd->result;
+ struct request *rq = cmd->request;
+ struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
+ struct request_queue *q = sdkp->disk->queue;
+ sector_t pos = blk_rq_pos(rq);
+ struct blk_zone *zone = NULL;
+ bool write_unlock = false;
+
+ /*
+ * Get the target zone of commands of interest. Some may
+ * apply to all zones so check the request sectors first.
+ */
+ switch (req_op(rq)) {
+ case REQ_OP_DISCARD:
+ case REQ_OP_WRITE:
+ case REQ_OP_WRITE_SAME:
+ case REQ_OP_ZONE_RESET:
+ write_unlock = true;
+ /* fallthru */
+ case REQ_OP_ZONE_OPEN:
+ case REQ_OP_ZONE_CLOSE:
+ case REQ_OP_ZONE_FINISH:
+ if (blk_rq_sectors(rq))
+ zone = blk_lookup_zone(q, pos);
+ break;
+ }
+
+ if (zone && write_unlock)
+ blk_write_unlock_zone(zone);
+
+ if (!result)
+ return;
+
+ if (sshdr->sense_key == ILLEGAL_REQUEST &&
+ sshdr->asc == 0x21)
+ /*
+ * Retrying a request that failed with any kind of
+ * alignment error is unlikely to succeed, so don't
+ * try. Report the error back to the user quickly so that
+ * corrective actions can be taken after obtaining updated
+ * zone information.
+ */
+ cmd->allowed = 0;
+
+ /* On error, force an update unless this is a failed report */
+ if (req_op(rq) == REQ_OP_ZONE_REPORT)
+ sd_zbc_clear_zones_updating(sdkp, pos, blk_rq_sectors(rq));
+ else if (zone)
+ sd_zbc_update_zones(sdkp, zone->start, zone->len,
+ GFP_ATOMIC, false);
+}
+
+void sd_zbc_read_zones(struct scsi_disk *sdkp, char *buf)
+{
+ struct request_queue *q = sdkp->disk->queue;
+ struct blk_zone *zone;
+ sector_t capacity;
+ sector_t sector;
+ bool init = false;
+ u32 rep_len;
+ int ret = 0;
+
+ if (sdkp->zoned != 1 && sdkp->device->type != TYPE_ZBC)
+ /*
+ * Device managed or normal SCSI disk,
+ * no special handling required
+ */
+ return;
+
+ /* Do a report zone to get the maximum LBA to check capacity */
+ ret = sd_zbc_report_zones(sdkp, buf, SD_BUF_SIZE,
+ 0, ZBC_ZONE_REPORTING_OPTION_ALL, false);
+ if (ret < 0)
+ return;
+
+ rep_len = get_unaligned_be32(&buf[0]);
+ if (rep_len < 64) {
+ sd_printk(KERN_WARNING, sdkp,
+ "REPORT ZONES report invalid length %u\n",
+ rep_len);
+ return;
+ }
+
+ if (sdkp->rc_basis == 0) {
+ /* The max_lba field is the capacity of this device */
+ sector_t lba = get_unaligned_be64(&buf[8]);
+ if (lba + 1 > sdkp->capacity) {
+ if (sdkp->first_scan)
+ sd_printk(KERN_WARNING, sdkp,
+ "Changing capacity from %zu "
+ "to max LBA+1 %zu\n",
+ sdkp->capacity,
+ (sector_t) lba + 1);
+ sdkp->capacity = lba + 1;
+ }
+ }
+
+ /* Setup the zone work queue */
+ if (!sdkp->zone_work_q) {
+ sdkp->zone_work_q =
+ alloc_ordered_workqueue("zbc_wq_%s", WQ_MEM_RECLAIM,
+ sdkp->disk->disk_name);
+ if (!sdkp->zone_work_q) {
+ sdev_printk(KERN_WARNING, sdkp->device,
+ "Create zoned disk workqueue failed\n");
+ return;
+ }
+ init = true;
+ }
+
+ /*
+ * Parse what we already got. If not all zones have been parsed yet,
+ * kick-start an update to get the remaining ones.
+ */
+ capacity = logical_to_sectors(sdkp->device, sdkp->capacity);
+ ret = zbc_parse_zones(sdkp, buf, SD_BUF_SIZE, &sector);
+ if (ret == 0 && sector < capacity) {
+ sd_zbc_update_zones(sdkp, sector, capacity - sector,
+ GFP_KERNEL, init);
+ drain_workqueue(sdkp->zone_work_q);
+ }
+ if (ret)
+ return;
+
+ /*
+ * Analyze the zones layout: if all zones are the same size and
+ * the size is a power of 2, chunk the device and map discard to
+ * reset write pointer command. Otherwise, disable discard.
+ */
+ sdkp->zone_sectors = 0;
+ sdkp->nr_zones = 0;
+ sector = 0;
+ while (sector < capacity) {
+
+ zone = blk_lookup_zone(q, sector);
+ if (!zone) {
+ sdkp->zone_sectors = 0;
+ sdkp->nr_zones = 0;
+ break;
+ }
+
+ sector += zone->len;
+
+ if (sdkp->zone_sectors == 0) {
+ sdkp->zone_sectors = zone->len;
+ } else if (sector != capacity &&
+ zone->len != sdkp->zone_sectors) {
+ sdkp->zone_sectors = 0;
+ sdkp->nr_zones = 0;
+ break;
+ }
+
+ sdkp->nr_zones++;
+
+ }
+
+ if (!sdkp->zone_sectors ||
+ !is_power_of_2(sdkp->zone_sectors)) {
+ sd_config_discard(sdkp, SD_LBP_DISABLE);
+ if (sdkp->first_scan)
+ sd_printk(KERN_NOTICE, sdkp,
+ "%u zones (non constant zone size)\n",
+ sdkp->nr_zones);
+ return;
+ }
+
+ /* Setup discard granularity to the zone size */
+ blk_queue_chunk_sectors(sdkp->disk->queue, sdkp->zone_sectors);
+ sdkp->max_unmap_blocks = sdkp->zone_sectors;
+ sdkp->unmap_alignment = sectors_to_logical(sdkp->device,
+ sdkp->zone_sectors);
+ sdkp->unmap_granularity = sdkp->unmap_alignment;
+ sd_config_discard(sdkp, SD_ZBC_RESET_WP);
+
+ if (sdkp->first_scan) {
+ if (sdkp->nr_zones * sdkp->zone_sectors == capacity)
+ sd_printk(KERN_NOTICE, sdkp,
+ "%u zones of %zu sectors\n",
+ sdkp->nr_zones,
+ sdkp->zone_sectors);
+ else
+ sd_printk(KERN_NOTICE, sdkp,
+ "%u zones of %zu sectors "
+ "+ 1 runt zone\n",
+ sdkp->nr_zones - 1,
+ sdkp->zone_sectors);
+ }
+}
+
+void sd_zbc_remove(struct scsi_disk *sdkp)
+{
+
+ sd_config_discard(sdkp, SD_LBP_DISABLE);
+
+ if (sdkp->zone_work_q) {
+ drain_workqueue(sdkp->zone_work_q);
+ destroy_workqueue(sdkp->zone_work_q);
+ sdkp->zone_work_q = NULL;
+ blk_drop_zones(sdkp->disk->queue);
+ }
+}
+
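For readers following the checks in sd_zbc_setup_read_write() above, the
write pointer handling can be modelled in user space. This is only a sketch
under stated assumptions: struct zone_model, model_write() and model_read()
are hypothetical names, the zone lock, the per-zone write lock and the
"in update" / unknown / offline / read-only cases are left out, and the
return codes merely mirror BLKPREP_OK, BLKPREP_KILL and BLKPREP_DONE.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins for the kernel types in the patch. */
enum zone_type { ZONE_CONV, ZONE_SEQ_REQ, ZONE_SEQ_PREF };
enum prep { PREP_OK, PREP_KILL, PREP_DONE };

struct zone_model {
	uint64_t start;		/* first sector of the zone */
	uint64_t len;		/* zone length in sectors */
	uint64_t wp;		/* write pointer position */
	enum zone_type type;
	bool full;		/* models BLK_ZONE_COND_FULL */
};

/* Write path: advance the write pointer or reject the request. */
static enum prep model_write(struct zone_model *z, uint64_t sector,
			     uint64_t nr)
{
	uint64_t zone_end = z->start + z->len;
	uint64_t end = sector + nr;

	if (end > zone_end)		/* request crosses the zone end */
		return PREP_KILL;
	if (z->type == ZONE_CONV)	/* conventional zone: no checks */
		return PREP_OK;

	if (z->type == ZONE_SEQ_REQ) {
		/* Host-managed: only writes at the write pointer. */
		if (sector != z->wp)
			return PREP_KILL;
		z->wp = end;
		if (z->wp >= zone_end) {
			z->full = true;
			z->wp = zone_end;
		}
	} else {
		/* Host-aware: write anywhere, wp only moves forward. */
		if (sector == z->wp && end >= zone_end) {
			z->full = true;
			z->wp = zone_end;
		} else if (end > z->wp) {
			z->wp = end;
		}
	}
	return PREP_OK;
}

/*
 * Read path: PREP_DONE means the whole request lies at or after the
 * write pointer (the patch zero-fills the buffer in that case). A read
 * that straddles the write pointer is shortened through *nr.
 */
static enum prep model_read(const struct zone_model *z, uint64_t sector,
			    uint64_t *nr)
{
	if (sector + *nr > z->start + z->len)
		return PREP_KILL;
	if (z->type == ZONE_CONV || sector + *nr <= z->wp)
		return PREP_OK;
	if (z->wp <= sector)
		return PREP_DONE;
	*nr = z->wp - sector;
	return PREP_OK;
}
```

With a sequential-write-required zone of 100 sectors and wp at 10, a write
at sector 20 is rejected, a write of 8 sectors at sector 10 moves wp to 18,
and a following 16-sector read at sector 10 is trimmed to 8 sectors.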
diff --git a/include/scsi/scsi_proto.h b/include/scsi/scsi_proto.h
index d1defd1..6ba66e0 100644
--- a/include/scsi/scsi_proto.h
+++ b/include/scsi/scsi_proto.h
@@ -299,4 +299,21 @@ struct scsi_lun {
#define SCSI_ACCESS_STATE_MASK 0x0f
#define SCSI_ACCESS_STATE_PREFERRED 0x80
+/* Reporting options for REPORT ZONES */
+enum zbc_zone_reporting_options {
+ ZBC_ZONE_REPORTING_OPTION_ALL = 0,
+ ZBC_ZONE_REPORTING_OPTION_EMPTY,
+ ZBC_ZONE_REPORTING_OPTION_IMPLICIT_OPEN,
+ ZBC_ZONE_REPORTING_OPTION_EXPLICIT_OPEN,
+ ZBC_ZONE_REPORTING_OPTION_CLOSED,
+ ZBC_ZONE_REPORTING_OPTION_FULL,
+ ZBC_ZONE_REPORTING_OPTION_READONLY,
+ ZBC_ZONE_REPORTING_OPTION_OFFLINE,
+ ZBC_ZONE_REPORTING_OPTION_NEED_RESET_WP = 0x10,
+ ZBC_ZONE_REPORTING_OPTION_NON_SEQWRITE,
+ ZBC_ZONE_REPORTING_OPTION_NON_WP = 0x3f,
+};
+
+#define ZBC_REPORT_ZONE_PARTIAL 0x80
+
#endif /* _SCSI_PROTO_H_ */
--
2.7.4
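The zone layout analysis at the end of sd_zbc_read_zones() in the patch
above (same-size zones, power-of-2 size, optional smaller last zone) can be
illustrated with a small user-space sketch. The names here are assumptions
made for the example: analyze_layout() and struct zone_layout do not exist
in the patch, and an array of zone lengths stands in for the
blk_lookup_zone() walk over the RB-tree.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct zone_layout {
	uint64_t zone_sectors;	/* common zone size, 0 if zones differ */
	unsigned int nr_zones;	/* zones counted, including a runt zone */
	bool uniform;		/* discard could be mapped to zone reset */
	bool runt;		/* last zone is not a full-size zone */
};

static bool is_pow2(uint64_t v)
{
	return v && !(v & (v - 1));
}

/* lens[] holds the zone lengths in sectors, in LBA order. */
static struct zone_layout analyze_layout(const uint64_t *lens, size_t n,
					 uint64_t capacity)
{
	struct zone_layout l = { 0, 0, false, false };
	uint64_t sector = 0;
	size_t i;

	for (i = 0; sector < capacity; i++) {
		if (i == n) {
			/* A zone lookup would fail here: give up. */
			l.zone_sectors = 0;
			l.nr_zones = 0;
			return l;
		}
		sector += lens[i];
		if (l.zone_sectors == 0) {
			l.zone_sectors = lens[i];
		} else if (sector != capacity &&
			   lens[i] != l.zone_sectors) {
			/* A zone other than the last has a different size. */
			l.zone_sectors = 0;
			l.nr_zones = 0;
			return l;
		}
		l.nr_zones++;
	}

	l.uniform = is_pow2(l.zone_sectors);
	l.runt = l.uniform &&
		 (uint64_t)l.nr_zones * l.zone_sectors != capacity;
	return l;
}
```

As in the patch, only the last zone is allowed to differ in size; when
`runt` is set, the patch reports `nr_zones - 1` full zones plus one runt
zone.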
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH 9/9] blk-zoned: Add ioctl interface for zone operations
2016-09-19 21:27 ` Damien Le Moal
@ 2016-09-19 21:27 ` Damien Le Moal
-1 siblings, 0 replies; 36+ messages in thread
From: Damien Le Moal @ 2016-09-19 21:27 UTC (permalink / raw)
To: linux-scsi, linux-block
Cc: martin.petersen, axboe, hare, shaun.tancheff, Damien Le Moal
From: Shaun Tancheff <shaun.tancheff@seagate.com>
Adds the new BLKUPDATEZONES, BLKREPORTZONE, BLKRESETZONE,
BLKOPENZONE, BLKCLOSEZONE and BLKFINISHZONE ioctls.
BLKREPORTZONE implementation uses the device queue zone RB-tree by
default and no actual command is issued to the device. If the
application needs access to the untracked zone attributes (non-seq
flag or reset recommended flag, offline or read-only zone condition,
etc), BLKUPDATEZONES must be issued first to force an update of the
cached zone information.
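As an illustration of the sequence described above, here is a minimal
user-space sketch. It assumes the descriptor layout and ioctl numbers
proposed in this patch (they are copied locally, with an RFC_ prefix on the
macros so that they cannot clash with any installed header), and
report_zone() is a hypothetical helper, not part of the series.

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Local copy of the descriptor proposed in this patch (64 bytes). */
struct blkzone {
	uint64_t start;		/* Zone start sector */
	uint64_t len;		/* Zone length in number of sectors */
	uint64_t wp;		/* Zone write pointer position */
	uint8_t type;		/* Zone type */
	uint8_t cond;		/* Zone condition */
	uint8_t non_seq;	/* Non-sequential write resources active */
	uint8_t reset;		/* Reset write pointer recommended */
	uint8_t reserved[36];
};

/* Same numbers as in the patch; the RFC_ prefix is an assumption. */
#define RFC_BLKUPDATEZONES	_IO(0x12, 130)
#define RFC_BLKREPORTZONE	_IOWR(0x12, 131, struct blkzone)

/*
 * Report the zone containing `sector` (512 B units). If `refresh` is
 * non-zero, force a zone cache update first so that the untracked
 * attributes are current. Returns 0 on success, -1 with errno set.
 */
static int report_zone(int fd, uint64_t sector, int refresh,
		       struct blkzone *z)
{
	if (refresh) {
		/*
		 * The update takes no argument, but blkdev_zone_ioctl()
		 * in this patch rejects a NULL pointer with EINVAL, so
		 * pass a dummy one.
		 */
		int dummy = 0;

		if (ioctl(fd, RFC_BLKUPDATEZONES, &dummy) < 0)
			return -1;
	}

	memset(z, 0, sizeof(*z));
	z->start = sector;
	return ioctl(fd, RFC_BLKREPORTZONE, z);
}
```

With this patch, the caller needs CAP_SYS_ADMIN, and the update step also
needs the device to be opened for writing; otherwise the ioctl fails with
EACCES or EBADF respectively.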
Changelog (Damien):
* Simplified blkzone descriptor (removed bit-fields and use CPU
endianness)
* Changed report ioctl to operate on single zone instead of an
array of blkzone structures.
Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
---
block/blk-zoned.c | 115 ++++++++++++++++++++++++++++++++++++++++++
block/ioctl.c | 8 +++
include/linux/blkdev.h | 7 +++
include/uapi/linux/Kbuild | 1 +
include/uapi/linux/blkzoned.h | 91 +++++++++++++++++++++++++++++++++
include/uapi/linux/fs.h | 1 +
6 files changed, 223 insertions(+)
create mode 100644 include/uapi/linux/blkzoned.h
diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index a107940..71205c8 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -12,6 +12,7 @@
#include <linux/module.h>
#include <linux/rbtree.h>
#include <linux/blkdev.h>
+#include <linux/blkzoned.h>
void blk_init_zones(struct request_queue *q)
{
@@ -336,3 +337,117 @@ int blkdev_finish_zone(struct block_device *bdev,
return blkdev_issue_zone_action(bdev, sector, REQ_OP_ZONE_FINISH,
gfp_mask);
}
+
+static int blkdev_report_zone_ioctl(struct block_device *bdev,
+ void __user *argp)
+{
+ struct blk_zone *zone;
+ struct blkzone z;
+
+ if (copy_from_user(&z, argp, sizeof(struct blkzone)))
+ return -EFAULT;
+
+ zone = blk_lookup_zone(bdev_get_queue(bdev), z.start);
+ if (!zone)
+ return -EINVAL;
+
+ memset(&z, 0, sizeof(struct blkzone));
+
+ blk_lock_zone(zone);
+
+ blk_wait_for_zone_update(zone);
+
+ z.len = zone->len;
+ z.start = zone->start;
+ z.wp = zone->wp;
+ z.type = zone->type;
+ z.cond = zone->cond;
+ z.non_seq = zone->non_seq;
+ z.reset = zone->reset;
+
+ blk_unlock_zone(zone);
+
+ if (copy_to_user(argp, &z, sizeof(struct blkzone)))
+ return -EFAULT;
+
+ return 0;
+}
+
+static int blkdev_zone_action_ioctl(struct block_device *bdev,
+ unsigned cmd, void __user *argp)
+{
+ unsigned int op;
+ u64 sector;
+
+ if (get_user(sector, (u64 __user *)argp))
+ return -EFAULT;
+
+ switch (cmd) {
+ case BLKRESETZONE:
+ op = REQ_OP_ZONE_RESET;
+ break;
+ case BLKOPENZONE:
+ op = REQ_OP_ZONE_OPEN;
+ break;
+ case BLKCLOSEZONE:
+ op = REQ_OP_ZONE_CLOSE;
+ break;
+ case BLKFINISHZONE:
+ op = REQ_OP_ZONE_FINISH;
+ break;
+ }
+
+ return blkdev_issue_zone_action(bdev, sector, op, GFP_KERNEL);
+}
+
+/**
+ * Called from blkdev_ioctl.
+ */
+int blkdev_zone_ioctl(struct block_device *bdev, fmode_t mode,
+ unsigned cmd, unsigned long arg)
+{
+ void __user *argp = (void __user *)arg;
+ struct request_queue *q;
+ int ret;
+
+ if (!argp)
+ return -EINVAL;
+
+ q = bdev_get_queue(bdev);
+ if (!q)
+ return -ENXIO;
+
+ if (!blk_queue_zoned(q))
+ return -ENOTTY;
+
+ if (!capable(CAP_SYS_ADMIN))
+ return -EACCES;
+
+ switch (cmd) {
+ case BLKREPORTZONE:
+ ret = blkdev_report_zone_ioctl(bdev, argp);
+ break;
+ case BLKUPDATEZONES:
+ if (!(mode & FMODE_WRITE)) {
+ ret = -EBADF;
+ break;
+ }
+ ret = blkdev_update_zones(bdev, GFP_KERNEL);
+ break;
+ case BLKRESETZONE:
+ case BLKOPENZONE:
+ case BLKCLOSEZONE:
+ case BLKFINISHZONE:
+ if (!(mode & FMODE_WRITE)) {
+ ret = -EBADF;
+ break;
+ }
+ ret = blkdev_zone_action_ioctl(bdev, cmd, argp);
+ break;
+ default:
+ ret = -ENOTTY;
+ break;
+ }
+
+ return ret;
+}
diff --git a/block/ioctl.c b/block/ioctl.c
index ed2397f..f09679a 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -3,6 +3,7 @@
#include <linux/export.h>
#include <linux/gfp.h>
#include <linux/blkpg.h>
+#include <linux/blkzoned.h>
#include <linux/hdreg.h>
#include <linux/backing-dev.h>
#include <linux/fs.h>
@@ -513,6 +514,13 @@ int blkdev_ioctl(struct block_device *bdev, fmode_t mode, unsigned cmd,
BLKDEV_DISCARD_SECURE);
case BLKZEROOUT:
return blk_ioctl_zeroout(bdev, mode, arg);
+ case BLKUPDATEZONES:
+ case BLKREPORTZONE:
+ case BLKRESETZONE:
+ case BLKOPENZONE:
+ case BLKCLOSEZONE:
+ case BLKFINISHZONE:
+ return blkdev_zone_ioctl(bdev, mode, cmd, arg);
case HDIO_GETGEO:
return blkdev_getgeo(bdev, argp);
case BLKRAGET:
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index a85f95b..0299d41 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -405,9 +405,16 @@ extern int blkdev_reset_zone(struct block_device *, sector_t, gfp_t);
extern int blkdev_open_zone(struct block_device *, sector_t, gfp_t);
extern int blkdev_close_zone(struct block_device *, sector_t, gfp_t);
extern int blkdev_finish_zone(struct block_device *, sector_t, gfp_t);
+extern int blkdev_zone_ioctl(struct block_device *, fmode_t, unsigned int,
+ unsigned long);
#else /* CONFIG_BLK_DEV_ZONED */
static inline void blk_init_zones(struct request_queue *q) { };
static inline void blk_drop_zones(struct request_queue *q) { };
+static inline int blkdev_zone_ioctl(struct block_device *bdev, fmode_t mode,
+ unsigned cmd, unsigned long arg)
+{
+ return -ENOTTY;
+}
#endif /* CONFIG_BLK_DEV_ZONED */
struct request_queue {
diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
index 185f8ea..a2a7522 100644
--- a/include/uapi/linux/Kbuild
+++ b/include/uapi/linux/Kbuild
@@ -70,6 +70,7 @@ header-y += bfs_fs.h
header-y += binfmts.h
header-y += blkpg.h
header-y += blktrace_api.h
+header-y += blkzoned.h
header-y += bpf_common.h
header-y += bpf.h
header-y += bpqether.h
diff --git a/include/uapi/linux/blkzoned.h b/include/uapi/linux/blkzoned.h
new file mode 100644
index 0000000..23a2702
--- /dev/null
+++ b/include/uapi/linux/blkzoned.h
@@ -0,0 +1,91 @@
+/*
+ * Zoned block devices handling.
+ *
+ * Copyright (C) 2015 Seagate Technology PLC
+ *
+ * Written by: Shaun Tancheff <shaun.tancheff@seagate.com>
+ *
+ * Modified by: Damien Le Moal <damien.lemoal@hgst.com>
+ * Copyright (C) 2016 Western Digital
+ *
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+#ifndef _UAPI_BLKZONED_H
+#define _UAPI_BLKZONED_H
+
+#include <linux/types.h>
+#include <linux/ioctl.h>
+
+/*
+ * Zone type.
+ */
+enum blkzone_type {
+ BLKZONE_TYPE_UNKNOWN,
+ BLKZONE_TYPE_CONVENTIONAL,
+ BLKZONE_TYPE_SEQWRITE_REQ,
+ BLKZONE_TYPE_SEQWRITE_PREF,
+};
+
+/*
+ * Zone condition.
+ */
+enum blkzone_cond {
+ BLKZONE_COND_NO_WP,
+ BLKZONE_COND_EMPTY,
+ BLKZONE_COND_IMP_OPEN,
+ BLKZONE_COND_EXP_OPEN,
+ BLKZONE_COND_CLOSED,
+ BLKZONE_COND_READONLY = 0xd,
+ BLKZONE_COND_FULL,
+ BLKZONE_COND_OFFLINE,
+};
+
+/*
+ * Zone descriptor for BLKREPORTZONE.
+ * start, len and wp use the regular 512 B sector unit,
+ * regardless of the device logical block size. The overall
+ * structure size is 64 B to match the ZBC/ZAC defined zone descriptor
+ * and allow support for future additional zone information.
+ */
+struct blkzone {
+ __u64 start; /* Zone start sector */
+ __u64 len; /* Zone length in number of sectors */
+ __u64 wp; /* Zone write pointer position */
+ __u8 type; /* Zone type */
+ __u8 cond; /* Zone condition */
+ __u8 non_seq; /* Non-sequential write resources active */
+ __u8 reset; /* Reset write pointer recommended */
+ __u8 reserved[36];
+};
+
+/*
+ * Zone ioctl's:
+ *
+ * BLKUPDATEZONES : Force update of all zones information
+ * BLKREPORTZONE : Get a zone descriptor. Takes a zone descriptor as
+ * argument. The zone to report is the one
+ * containing the sector initially specified in the
+ * descriptor start field.
+ * BLKRESETZONE : Reset the write pointer of the zone containing the
+ * specified sector, or of all written zones if the
+ * sector is ~0ull.
+ * BLKOPENZONE : Explicitly open the zone containing the
+ * specified sector, or all possible zones if the
+ * sector is ~0ull (the drive determines which zone
+ * to open in this case).
+ * BLKCLOSEZONE : Close the zone containing the specified sector, or
+ * all open zones if the sector is ~0ull.
+ * BLKFINISHZONE : Finish the zone (make it full) containing the
+ * specified sector, or all open and closed zones if
+ * the sector is ~0ull.
+ */
+#define BLKUPDATEZONES _IO(0x12,130)
+#define BLKREPORTZONE _IOWR(0x12,131,struct blkzone)
+#define BLKRESETZONE _IOW(0x12,132,unsigned long long)
+#define BLKOPENZONE _IOW(0x12,133,unsigned long long)
+#define BLKCLOSEZONE _IOW(0x12,134,unsigned long long)
+#define BLKFINISHZONE _IOW(0x12,135,unsigned long long)
+
+#endif /* _UAPI_BLKZONED_H */
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index 3b00f7c..1db6d66 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -222,6 +222,7 @@ struct fsxattr {
#define BLKSECDISCARD _IO(0x12,125)
#define BLKROTATIONAL _IO(0x12,126)
#define BLKZEROOUT _IO(0x12,127)
+/* A jump here: 130-135 are used for zoned block devices (see uapi/linux/blkzoned.h) */
#define BMAP_IOCTL 1 /* obsolete - kept for compatibility */
#define FIBMAP _IO(0x00,1) /* bmap access */
--
2.7.4
^ permalink raw reply related [flat|nested] 36+ messages in thread
* Re: [PATCH 8/9] sd: Implement support for ZBC devices
2016-09-19 21:27 ` Damien Le Moal
@ 2016-09-20 0:08 ` kbuild test robot
-1 siblings, 0 replies; 36+ messages in thread
From: kbuild test robot @ 2016-09-20 0:08 UTC (permalink / raw)
To: Damien Le Moal
Cc: kbuild-all, linux-scsi, linux-block, martin.petersen, axboe,
hare, shaun.tancheff, Hannes Reinecke, Damien Le Moal
[-- Attachment #1: Type: text/plain, Size: 30989 bytes --]
Hi Hannes,
[auto build test WARNING on linus/master]
[also build test WARNING on v4.8-rc7]
[cannot apply to block/for-next next-20160919]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
[Suggest to use git(>=2.9.0) format-patch --base=<commit> (or --base=auto for convenience) to record what (public, well-known) commit your patch series was built on]
[Check https://git-scm.com/docs/git-format-patch for more information]
url: https://github.com/0day-ci/linux/commits/Damien-Le-Moal/ZBC-Zoned-block-device-support/20160920-062608
config: i386-allmodconfig (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
# save the attached .config to linux build tree
make ARCH=i386
All warnings (new ones prefixed by >>):
In file included from include/linux/kernel.h:13:0,
from include/linux/sched.h:17,
from include/linux/blkdev.h:4,
from drivers/scsi/sd_zbc.c:24:
drivers/scsi/sd_zbc.c: In function 'sd_zbc_report_zones':
>> drivers/scsi/sd_zbc.c:61:11: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
pr_debug("%s %s [%s]: " fmt, \
^
include/linux/printk.h:260:21: note: in definition of macro 'pr_fmt'
#define pr_fmt(fmt) fmt
^~~
include/linux/printk.h:308:2: note: in expansion of macro 'dynamic_pr_debug'
dynamic_pr_debug(fmt, ##__VA_ARGS__)
^~~~~~~~~~~~~~~~
>> drivers/scsi/sd_zbc.c:61:2: note: in expansion of macro 'pr_debug'
pr_debug("%s %s [%s]: " fmt, \
^~~~~~~~
>> drivers/scsi/sd_zbc.c:221:2: note: in expansion of macro 'sd_zbc_debug'
sd_zbc_debug(sdkp, "REPORT ZONES lba %zu len %d\n",
^~~~~~~~~~~~
In file included from include/linux/printk.h:6:0,
from include/linux/kernel.h:13,
from include/linux/sched.h:17,
from include/linux/blkdev.h:4,
from drivers/scsi/sd_zbc.c:24:
>> include/linux/kern_levels.h:4:18: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
#define KERN_SOH "\001" /* ASCII Start Of Header */
^
include/linux/kern_levels.h:10:18: note: in expansion of macro 'KERN_SOH'
#define KERN_ERR KERN_SOH "3" /* error conditions */
^~~~~~~~
include/linux/printk.h:276:9: note: in expansion of macro 'KERN_ERR'
printk(KERN_ERR pr_fmt(fmt), ##__VA_ARGS__)
^~~~~~~~
>> drivers/scsi/sd_zbc.c:73:2: note: in expansion of macro 'pr_err'
pr_err("%s %s [%s]: " fmt, \
^~~~~~
>> drivers/scsi/sd_zbc.c:237:3: note: in expansion of macro 'sd_zbc_err'
sd_zbc_err(sdkp,
^~~~~~~~~~
In file included from include/linux/kernel.h:13:0,
from include/linux/sched.h:17,
from include/linux/blkdev.h:4,
from drivers/scsi/sd_zbc.c:24:
drivers/scsi/sd_zbc.c: In function 'sd_zbc_setup_reset_cmnd':
>> drivers/scsi/sd_zbc.c:61:11: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
pr_debug("%s %s [%s]: " fmt, \
^
include/linux/printk.h:260:21: note: in definition of macro 'pr_fmt'
#define pr_fmt(fmt) fmt
^~~
include/linux/printk.h:308:2: note: in expansion of macro 'dynamic_pr_debug'
dynamic_pr_debug(fmt, ##__VA_ARGS__)
^~~~~~~~~~~~~~~~
>> drivers/scsi/sd_zbc.c:61:2: note: in expansion of macro 'pr_debug'
pr_debug("%s %s [%s]: " fmt, \
^~~~~~~~
drivers/scsi/sd_zbc.c:489:4: note: in expansion of macro 'sd_zbc_debug'
sd_zbc_debug(sdkp,
^~~~~~~~~~~~
In file included from drivers/scsi/sd_zbc.c:37:0:
drivers/scsi/sd_zbc.c:517:7: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned reset wp request, start %zu/%zu"
^
drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
(sdsk)->disk->disk_name, fmt, ##a) : \
^~~
drivers/scsi/sd_zbc.c:517:7: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned reset wp request, start %zu/%zu"
^
drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
(sdsk)->disk->disk_name, fmt, ##a) : \
^~~
drivers/scsi/sd_zbc.c:517:7: warning: format '%zu' expects argument of type 'size_t', but argument 7 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned reset wp request, start %zu/%zu"
^
drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
(sdsk)->disk->disk_name, fmt, ##a) : \
^~~
drivers/scsi/sd_zbc.c:517:7: warning: format '%zu' expects argument of type 'size_t', but argument 8 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned reset wp request, start %zu/%zu"
^
drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
(sdsk)->disk->disk_name, fmt, ##a) : \
^~~
In file included from include/scsi/scsi_cmnd.h:10:0,
from drivers/scsi/sd_zbc.c:30:
drivers/scsi/sd_zbc.c:517:7: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned reset wp request, start %zu/%zu"
^
include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
^~~
>> drivers/scsi/sd_zbc.c:516:4: note: in expansion of macro 'sd_printk'
sd_printk(KERN_ERR, sdkp,
^~~~~~~~~
drivers/scsi/sd_zbc.c:517:7: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned reset wp request, start %zu/%zu"
^
include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
^~~
>> drivers/scsi/sd_zbc.c:516:4: note: in expansion of macro 'sd_printk'
sd_printk(KERN_ERR, sdkp,
^~~~~~~~~
drivers/scsi/sd_zbc.c:517:7: warning: format '%zu' expects argument of type 'size_t', but argument 7 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned reset wp request, start %zu/%zu"
^
include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
^~~
>> drivers/scsi/sd_zbc.c:516:4: note: in expansion of macro 'sd_printk'
sd_printk(KERN_ERR, sdkp,
^~~~~~~~~
drivers/scsi/sd_zbc.c:517:7: warning: format '%zu' expects argument of type 'size_t', but argument 8 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned reset wp request, start %zu/%zu"
^
include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
^~~
>> drivers/scsi/sd_zbc.c:516:4: note: in expansion of macro 'sd_printk'
sd_printk(KERN_ERR, sdkp,
^~~~~~~~~
In file included from include/linux/kernel.h:13:0,
from include/linux/sched.h:17,
from include/linux/blkdev.h:4,
from drivers/scsi/sd_zbc.c:24:
drivers/scsi/sd_zbc.c: In function 'sd_zbc_setup_open_cmnd':
>> drivers/scsi/sd_zbc.c:61:11: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
pr_debug("%s %s [%s]: " fmt, \
^
include/linux/printk.h:260:21: note: in definition of macro 'pr_fmt'
#define pr_fmt(fmt) fmt
^~~
include/linux/printk.h:308:2: note: in expansion of macro 'dynamic_pr_debug'
dynamic_pr_debug(fmt, ##__VA_ARGS__)
^~~~~~~~~~~~~~~~
>> drivers/scsi/sd_zbc.c:61:2: note: in expansion of macro 'pr_debug'
pr_debug("%s %s [%s]: " fmt, \
^~~~~~~~
drivers/scsi/sd_zbc.c:573:4: note: in expansion of macro 'sd_zbc_debug'
sd_zbc_debug(sdkp,
^~~~~~~~~~~~
In file included from drivers/scsi/sd_zbc.c:37:0:
drivers/scsi/sd_zbc.c:594:7: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned open zone request, start %zu/%zu"
^
drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
(sdsk)->disk->disk_name, fmt, ##a) : \
^~~
drivers/scsi/sd_zbc.c:594:7: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned open zone request, start %zu/%zu"
^
drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
(sdsk)->disk->disk_name, fmt, ##a) : \
^~~
drivers/scsi/sd_zbc.c:594:7: warning: format '%zu' expects argument of type 'size_t', but argument 7 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned open zone request, start %zu/%zu"
^
drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
(sdsk)->disk->disk_name, fmt, ##a) : \
^~~
drivers/scsi/sd_zbc.c:594:7: warning: format '%zu' expects argument of type 'size_t', but argument 8 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned open zone request, start %zu/%zu"
^
drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
(sdsk)->disk->disk_name, fmt, ##a) : \
^~~
In file included from include/scsi/scsi_cmnd.h:10:0,
from drivers/scsi/sd_zbc.c:30:
drivers/scsi/sd_zbc.c:594:7: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned open zone request, start %zu/%zu"
^
include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
^~~
drivers/scsi/sd_zbc.c:593:4: note: in expansion of macro 'sd_printk'
sd_printk(KERN_ERR, sdkp,
^~~~~~~~~
drivers/scsi/sd_zbc.c:594:7: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned open zone request, start %zu/%zu"
^
include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
^~~
drivers/scsi/sd_zbc.c:593:4: note: in expansion of macro 'sd_printk'
sd_printk(KERN_ERR, sdkp,
^~~~~~~~~
drivers/scsi/sd_zbc.c:594:7: warning: format '%zu' expects argument of type 'size_t', but argument 7 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned open zone request, start %zu/%zu"
^
include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
^~~
drivers/scsi/sd_zbc.c:593:4: note: in expansion of macro 'sd_printk'
sd_printk(KERN_ERR, sdkp,
^~~~~~~~~
drivers/scsi/sd_zbc.c:594:7: warning: format '%zu' expects argument of type 'size_t', but argument 8 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned open zone request, start %zu/%zu"
^
include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
^~~
drivers/scsi/sd_zbc.c:593:4: note: in expansion of macro 'sd_printk'
sd_printk(KERN_ERR, sdkp,
^~~~~~~~~
In file included from include/linux/kernel.h:13:0,
from include/linux/sched.h:17,
from include/linux/blkdev.h:4,
from drivers/scsi/sd_zbc.c:24:
drivers/scsi/sd_zbc.c: In function 'sd_zbc_setup_close_cmnd':
>> drivers/scsi/sd_zbc.c:61:11: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
pr_debug("%s %s [%s]: " fmt, \
^
include/linux/printk.h:260:21: note: in definition of macro 'pr_fmt'
#define pr_fmt(fmt) fmt
^~~
include/linux/printk.h:308:2: note: in expansion of macro 'dynamic_pr_debug'
dynamic_pr_debug(fmt, ##__VA_ARGS__)
^~~~~~~~~~~~~~~~
>> drivers/scsi/sd_zbc.c:61:2: note: in expansion of macro 'pr_debug'
pr_debug("%s %s [%s]: " fmt, \
^~~~~~~~
drivers/scsi/sd_zbc.c:645:4: note: in expansion of macro 'sd_zbc_debug'
sd_zbc_debug(sdkp,
^~~~~~~~~~~~
In file included from drivers/scsi/sd_zbc.c:37:0:
drivers/scsi/sd_zbc.c:666:7: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned close zone request, start %zu/%zu"
^
drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
(sdsk)->disk->disk_name, fmt, ##a) : \
^~~
drivers/scsi/sd_zbc.c:666:7: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned close zone request, start %zu/%zu"
^
drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
(sdsk)->disk->disk_name, fmt, ##a) : \
^~~
drivers/scsi/sd_zbc.c:666:7: warning: format '%zu' expects argument of type 'size_t', but argument 7 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned close zone request, start %zu/%zu"
^
drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
(sdsk)->disk->disk_name, fmt, ##a) : \
^~~
drivers/scsi/sd_zbc.c:666:7: warning: format '%zu' expects argument of type 'size_t', but argument 8 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned close zone request, start %zu/%zu"
^
drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
(sdsk)->disk->disk_name, fmt, ##a) : \
^~~
In file included from include/scsi/scsi_cmnd.h:10:0,
from drivers/scsi/sd_zbc.c:30:
drivers/scsi/sd_zbc.c:666:7: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned close zone request, start %zu/%zu"
^
include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
^~~
drivers/scsi/sd_zbc.c:665:4: note: in expansion of macro 'sd_printk'
sd_printk(KERN_ERR, sdkp,
^~~~~~~~~
drivers/scsi/sd_zbc.c:666:7: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned close zone request, start %zu/%zu"
^
include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
^~~
drivers/scsi/sd_zbc.c:665:4: note: in expansion of macro 'sd_printk'
sd_printk(KERN_ERR, sdkp,
^~~~~~~~~
drivers/scsi/sd_zbc.c:666:7: warning: format '%zu' expects argument of type 'size_t', but argument 7 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned close zone request, start %zu/%zu"
^
include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
^~~
drivers/scsi/sd_zbc.c:665:4: note: in expansion of macro 'sd_printk'
sd_printk(KERN_ERR, sdkp,
^~~~~~~~~
drivers/scsi/sd_zbc.c:666:7: warning: format '%zu' expects argument of type 'size_t', but argument 8 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned close zone request, start %zu/%zu"
^
include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
^~~
drivers/scsi/sd_zbc.c:665:4: note: in expansion of macro 'sd_printk'
sd_printk(KERN_ERR, sdkp,
^~~~~~~~~
In file included from include/linux/kernel.h:13:0,
from include/linux/sched.h:17,
from include/linux/blkdev.h:4,
from drivers/scsi/sd_zbc.c:24:
drivers/scsi/sd_zbc.c: In function 'sd_zbc_setup_finish_cmnd':
>> drivers/scsi/sd_zbc.c:61:11: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
pr_debug("%s %s [%s]: " fmt, \
^
include/linux/printk.h:260:21: note: in definition of macro 'pr_fmt'
#define pr_fmt(fmt) fmt
^~~
include/linux/printk.h:308:2: note: in expansion of macro 'dynamic_pr_debug'
dynamic_pr_debug(fmt, ##__VA_ARGS__)
^~~~~~~~~~~~~~~~
>> drivers/scsi/sd_zbc.c:61:2: note: in expansion of macro 'pr_debug'
pr_debug("%s %s [%s]: " fmt, \
^~~~~~~~
drivers/scsi/sd_zbc.c:717:4: note: in expansion of macro 'sd_zbc_debug'
sd_zbc_debug(sdkp,
^~~~~~~~~~~~
In file included from drivers/scsi/sd_zbc.c:37:0:
drivers/scsi/sd_zbc.c:734:7: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned finish zone request, start %zu/%zu"
^
drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
(sdsk)->disk->disk_name, fmt, ##a) : \
^~~
drivers/scsi/sd_zbc.c:734:7: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned finish zone request, start %zu/%zu"
^
drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
(sdsk)->disk->disk_name, fmt, ##a) : \
^~~
drivers/scsi/sd_zbc.c:734:7: warning: format '%zu' expects argument of type 'size_t', but argument 7 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned finish zone request, start %zu/%zu"
^
drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
(sdsk)->disk->disk_name, fmt, ##a) : \
^~~
drivers/scsi/sd_zbc.c:734:7: warning: format '%zu' expects argument of type 'size_t', but argument 8 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned finish zone request, start %zu/%zu"
^
drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
(sdsk)->disk->disk_name, fmt, ##a) : \
^~~
In file included from include/scsi/scsi_cmnd.h:10:0,
from drivers/scsi/sd_zbc.c:30:
drivers/scsi/sd_zbc.c:734:7: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned finish zone request, start %zu/%zu"
^
include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
^~~
drivers/scsi/sd_zbc.c:733:4: note: in expansion of macro 'sd_printk'
sd_printk(KERN_ERR, sdkp,
^~~~~~~~~
drivers/scsi/sd_zbc.c:734:7: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned finish zone request, start %zu/%zu"
^
include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
^~~
drivers/scsi/sd_zbc.c:733:4: note: in expansion of macro 'sd_printk'
sd_printk(KERN_ERR, sdkp,
^~~~~~~~~
drivers/scsi/sd_zbc.c:734:7: warning: format '%zu' expects argument of type 'size_t', but argument 7 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned finish zone request, start %zu/%zu"
^
include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
^~~
drivers/scsi/sd_zbc.c:733:4: note: in expansion of macro 'sd_printk'
sd_printk(KERN_ERR, sdkp,
^~~~~~~~~
drivers/scsi/sd_zbc.c:734:7: warning: format '%zu' expects argument of type 'size_t', but argument 8 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned finish zone request, start %zu/%zu"
^
include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
^~~
drivers/scsi/sd_zbc.c:733:4: note: in expansion of macro 'sd_printk'
sd_printk(KERN_ERR, sdkp,
^~~~~~~~~
In file included from include/linux/kernel.h:13:0,
from include/linux/sched.h:17,
from include/linux/blkdev.h:4,
from drivers/scsi/sd_zbc.c:24:
drivers/scsi/sd_zbc.c: In function 'sd_zbc_setup_read_write':
>> drivers/scsi/sd_zbc.c:61:11: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
pr_debug("%s %s [%s]: " fmt, \
^
include/linux/printk.h:260:21: note: in definition of macro 'pr_fmt'
#define pr_fmt(fmt) fmt
^~~
include/linux/printk.h:308:2: note: in expansion of macro 'dynamic_pr_debug'
dynamic_pr_debug(fmt, ##__VA_ARGS__)
^~~~~~~~~~~~~~~~
>> drivers/scsi/sd_zbc.c:61:2: note: in expansion of macro 'pr_debug'
pr_debug("%s %s [%s]: " fmt, \
^~~~~~~~
drivers/scsi/sd_zbc.c:783:3: note: in expansion of macro 'sd_zbc_debug'
sd_zbc_debug(sdkp,
^~~~~~~~~~~~
In file included from drivers/scsi/sd_zbc.c:37:0:
drivers/scsi/sd_zbc.c: In function 'sd_zbc_read_zones':
drivers/scsi/sd_zbc.c:985:8: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Changing capacity from %zu "
^
drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
(sdsk)->disk->disk_name, fmt, ##a) : \
^~~
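
All of the warnings above share one cause: '%zu' is the specifier for size_t, while sector_t is 'long long unsigned int' on this i386 configuration. The usual kernel convention for printing such values is '%llu' with an explicit cast to unsigned long long, which stays correct whatever the width of sector_t. A minimal userspace sketch of that pattern follows; the typedef and the format_report_zones() helper are stand-ins for illustration, not code from the patch.

```c
#include <stdio.h>

/* Stand-in for the kernel's sector_t as reported for this config. */
typedef unsigned long long sector_t;

/*
 * Hypothetical helper mirroring the REPORT ZONES debug message.
 * The cast plus %llu is valid whether sector_t is 32 or 64 bits wide.
 */
static int format_report_zones(char *buf, size_t len,
			       sector_t lba, int bufflen)
{
	return snprintf(buf, len, "REPORT ZONES lba %llu len %d\n",
			(unsigned long long)lba, bufflen);
}
```

The same substitution would apply to each '%zu' flagged in sd_zbc.c, including the "Unaligned ... request" and "Changing capacity" messages.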
vim +61 drivers/scsi/sd_zbc.c
18 * along with this program; see the file COPYING. If not, write to
19 * the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139,
20 * USA.
21 *
22 */
23
> 24 #include <linux/blkdev.h>
25 #include <linux/rbtree.h>
26
27 #include <asm/unaligned.h>
28
29 #include <scsi/scsi.h>
> 30 #include <scsi/scsi_cmnd.h>
31 #include <scsi/scsi_dbg.h>
32 #include <scsi/scsi_device.h>
33 #include <scsi/scsi_driver.h>
34 #include <scsi/scsi_host.h>
35 #include <scsi/scsi_eh.h>
36
> 37 #include "sd.h"
38 #include "scsi_priv.h"
39
40 enum zbc_zone_type {
41 ZBC_ZONE_TYPE_CONV = 0x1,
42 ZBC_ZONE_TYPE_SEQWRITE_REQ,
43 ZBC_ZONE_TYPE_SEQWRITE_PREF,
44 ZBC_ZONE_TYPE_RESERVED,
45 };
46
47 enum zbc_zone_cond {
48 ZBC_ZONE_COND_NO_WP,
49 ZBC_ZONE_COND_EMPTY,
50 ZBC_ZONE_COND_IMP_OPEN,
51 ZBC_ZONE_COND_EXP_OPEN,
52 ZBC_ZONE_COND_CLOSED,
53 ZBC_ZONE_COND_READONLY = 0xd,
54 ZBC_ZONE_COND_FULL,
55 ZBC_ZONE_COND_OFFLINE,
56 };
57
58 #define SD_ZBC_BUF_SIZE 131072
59
60 #define sd_zbc_debug(sdkp, fmt, args...) \
> 61 pr_debug("%s %s [%s]: " fmt, \
62 dev_driver_string(&(sdkp)->device->sdev_gendev), \
63 dev_name(&(sdkp)->device->sdev_gendev), \
64 (sdkp)->disk->disk_name, ## args)
65
66 #define sd_zbc_debug_ratelimit(sdkp, fmt, args...) \
67 do { \
68 if (printk_ratelimit()) \
69 sd_zbc_debug(sdkp, fmt, ## args); \
70 } while( 0 )
71
72 #define sd_zbc_err(sdkp, fmt, args...) \
> 73 pr_err("%s %s [%s]: " fmt, \
74 dev_driver_string(&(sdkp)->device->sdev_gendev), \
75 dev_name(&(sdkp)->device->sdev_gendev), \
76 (sdkp)->disk->disk_name, ## args)
77
78 struct zbc_zone_work {
79 struct work_struct zone_work;
80 struct scsi_disk *sdkp;
81 sector_t sector;
82 sector_t nr_sects;
83 bool init;
84 unsigned int nr_zones;
85 };
86
87 struct blk_zone *zbc_desc_to_zone(struct scsi_disk *sdkp, unsigned char *rec)
88 {
89 struct blk_zone *zone;
90
91 zone = kzalloc(sizeof(struct blk_zone), GFP_KERNEL);
92 if (!zone)
93 return NULL;
94
95 /* Zone type */
96 switch(rec[0] & 0x0f) {
97 case ZBC_ZONE_TYPE_CONV:
98 case ZBC_ZONE_TYPE_SEQWRITE_REQ:
99 case ZBC_ZONE_TYPE_SEQWRITE_PREF:
100 zone->type = rec[0] & 0x0f;
101 break;
102 default:
103 zone->type = BLK_ZONE_TYPE_UNKNOWN;
104 break;
105 }
106
107 /* Zone condition */
108 zone->cond = (rec[1] >> 4) & 0xf;
109 if (rec[1] & 0x01)
110 zone->reset = 1;
111 if (rec[1] & 0x02)
112 zone->non_seq = 1;
113
114 /* Zone start sector and length */
115 zone->len = logical_to_sectors(sdkp->device,
116 get_unaligned_be64(&rec[8]));
117 zone->start = logical_to_sectors(sdkp->device,
118 get_unaligned_be64(&rec[16]));
119
120 /* Zone write pointer */
121 if (blk_zone_is_empty(zone) &&
122 zone->wp != zone->start)
123 zone->wp = zone->start;
124 else if (blk_zone_is_full(zone))
125 zone->wp = zone->start + zone->len;
126 else if (blk_zone_is_seq(zone))
127 zone->wp = logical_to_sectors(sdkp->device,
128 get_unaligned_be64(&rec[24]));
129 else
130 zone->wp = (sector_t)-1;
131
132 return zone;
133 }
134
135 static int zbc_parse_zones(struct scsi_disk *sdkp, unsigned char *buf,
136 unsigned int buf_len, sector_t *next_sector)
137 {
138 struct request_queue *q = sdkp->disk->queue;
139 sector_t capacity = logical_to_sectors(sdkp->device, sdkp->capacity);
140 unsigned char *rec = buf;
141 unsigned int zone_len, list_length;
142
143 /* Parse REPORT ZONES header */
144 list_length = get_unaligned_be32(&buf[0]);
145 rec = buf + 64;
146 list_length += 64;
147
148 if (list_length < buf_len)
149 buf_len = list_length;
150
151 /* Parse REPORT ZONES zone descriptors */
152 *next_sector = capacity;
153 while (rec < buf + buf_len) {
154
155 struct blk_zone *new, *old;
156
157 new = zbc_desc_to_zone(sdkp, rec);
158 if (!new)
159 return -ENOMEM;
160
161 zone_len = new->len;
162 *next_sector = new->start + zone_len;
163
164 old = blk_insert_zone(q, new);
165 if (old) {
166 blk_lock_zone(old);
167
168 /*
169 * Always update the zone state flags and the zone
170 * offline and read-only condition as the drive may
171 * change those independently of the commands being
172 * executed
173 */
174 old->reset = new->reset;
175 old->non_seq = new->non_seq;
176 if (blk_zone_is_offline(new) ||
177 blk_zone_is_readonly(new))
178 old->cond = new->cond;
179
180 if (blk_zone_in_update(old)) {
181 old->cond = new->cond;
182 old->wp = new->wp;
183 blk_clear_zone_update(old);
184 }
185
186 blk_unlock_zone(old);
187
188 kfree(new);
189 }
190
191 rec += 64;
192
193 }
194
195 return 0;
196 }
197
198 /**
199 * sd_zbc_report_zones - Issue a REPORT ZONES scsi command
200 * @sdkp: SCSI disk to which the command should be sent
201 * @buffer: response buffer
202 * @bufflen: length of @buffer
203 * @start_sector: logical sector from which the zone information should be reported
204 * @option: reporting option to be used
205 * @partial: flag to set the 'partial' bit for report zones command
206 */
207 int sd_zbc_report_zones(struct scsi_disk *sdkp, unsigned char *buffer,
208 int bufflen, sector_t start_sector,
209 enum zbc_zone_reporting_options option, bool partial)
210 {
211 struct scsi_device *sdp = sdkp->device;
212 const int timeout = sdp->request_queue->rq_timeout;
213 struct scsi_sense_hdr sshdr;
214 sector_t start_lba = sectors_to_logical(sdkp->device, start_sector);
215 unsigned char cmd[16];
216 int result;
217
218 if (!scsi_device_online(sdp))
219 return -ENODEV;
220
> 221 sd_zbc_debug(sdkp, "REPORT ZONES lba %zu len %d\n",
222 start_lba, bufflen);
223
224 memset(cmd, 0, 16);
225 cmd[0] = ZBC_IN;
226 cmd[1] = ZI_REPORT_ZONES;
227 put_unaligned_be64(start_lba, &cmd[2]);
228 put_unaligned_be32(bufflen, &cmd[10]);
229 cmd[14] = (partial ? ZBC_REPORT_ZONE_PARTIAL : 0) | option;
230 memset(buffer, 0, bufflen);
231
232 result = scsi_execute_req(sdp, cmd, DMA_FROM_DEVICE,
233 buffer, bufflen, &sshdr,
234 timeout, SD_MAX_RETRIES, NULL);
235
236 if (result) {
> 237 sd_zbc_err(sdkp,
238 "REPORT ZONES lba %zu failed with %d/%d\n",
239 start_lba, host_byte(result), driver_byte(result));
240 return -EIO;
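
To make the descriptor layout used by zbc_desc_to_zone() in the excerpt above easier to follow, here is a standalone sketch that decodes one 64-byte REPORT ZONES descriptor with the same byte offsets. The struct and function names are hypothetical, and unlike the driver code the values are left in logical blocks rather than converted to 512-byte sectors.

```c
#include <stdint.h>

/* Hypothetical userspace view of one zone descriptor. */
struct zone_desc {
	unsigned int type;	/* rec[0], low nibble */
	unsigned int cond;	/* rec[1], high nibble */
	int reset;		/* rec[1], bit 0 */
	int non_seq;		/* rec[1], bit 1 */
	uint64_t len;		/* bytes 8..15, big endian */
	uint64_t start;		/* bytes 16..23, big endian */
	uint64_t wp;		/* bytes 24..31, big endian */
};

/* Read a big-endian 64-bit value from an unaligned buffer. */
static uint64_t be64(const unsigned char *p)
{
	uint64_t v = 0;
	int i;

	for (i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Decode one descriptor using the offsets from the excerpt above. */
static void parse_zone_desc(const unsigned char *rec, struct zone_desc *z)
{
	z->type = rec[0] & 0x0f;
	z->cond = (rec[1] >> 4) & 0x0f;
	z->reset = rec[1] & 0x01;
	z->non_seq = (rec[1] >> 1) & 0x01;
	z->len = be64(&rec[8]);
	z->start = be64(&rec[16]);
	z->wp = be64(&rec[24]);
}
```

In zbc_parse_zones() the descriptors start 64 bytes into the reply buffer, after the header whose first four bytes hold the list length, and each descriptor is 64 bytes long.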
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 55864 bytes --]
(sdsk)->disk->disk_name, fmt, ##a) : \
^~~
drivers/scsi/sd_zbc.c:594:7: warning: format '%zu' expects argument of type 'size_t', but argument 7 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned open zone request, start %zu/%zu"
^
drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
(sdsk)->disk->disk_name, fmt, ##a) : \
^~~
drivers/scsi/sd_zbc.c:594:7: warning: format '%zu' expects argument of type 'size_t', but argument 8 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned open zone request, start %zu/%zu"
^
drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
(sdsk)->disk->disk_name, fmt, ##a) : \
^~~
In file included from include/scsi/scsi_cmnd.h:10:0,
from drivers/scsi/sd_zbc.c:30:
drivers/scsi/sd_zbc.c:594:7: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned open zone request, start %zu/%zu"
^
include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
^~~
drivers/scsi/sd_zbc.c:593:4: note: in expansion of macro 'sd_printk'
sd_printk(KERN_ERR, sdkp,
^~~~~~~~~
drivers/scsi/sd_zbc.c:594:7: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned open zone request, start %zu/%zu"
^
include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
^~~
drivers/scsi/sd_zbc.c:593:4: note: in expansion of macro 'sd_printk'
sd_printk(KERN_ERR, sdkp,
^~~~~~~~~
drivers/scsi/sd_zbc.c:594:7: warning: format '%zu' expects argument of type 'size_t', but argument 7 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned open zone request, start %zu/%zu"
^
include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
^~~
drivers/scsi/sd_zbc.c:593:4: note: in expansion of macro 'sd_printk'
sd_printk(KERN_ERR, sdkp,
^~~~~~~~~
drivers/scsi/sd_zbc.c:594:7: warning: format '%zu' expects argument of type 'size_t', but argument 8 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned open zone request, start %zu/%zu"
^
include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
^~~
drivers/scsi/sd_zbc.c:593:4: note: in expansion of macro 'sd_printk'
sd_printk(KERN_ERR, sdkp,
^~~~~~~~~
In file included from include/linux/kernel.h:13:0,
from include/linux/sched.h:17,
from include/linux/blkdev.h:4,
from drivers/scsi/sd_zbc.c:24:
drivers/scsi/sd_zbc.c: In function 'sd_zbc_setup_close_cmnd':
>> drivers/scsi/sd_zbc.c:61:11: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
pr_debug("%s %s [%s]: " fmt, \
^
include/linux/printk.h:260:21: note: in definition of macro 'pr_fmt'
#define pr_fmt(fmt) fmt
^~~
include/linux/printk.h:308:2: note: in expansion of macro 'dynamic_pr_debug'
dynamic_pr_debug(fmt, ##__VA_ARGS__)
^~~~~~~~~~~~~~~~
>> drivers/scsi/sd_zbc.c:61:2: note: in expansion of macro 'pr_debug'
pr_debug("%s %s [%s]: " fmt, \
^~~~~~~~
drivers/scsi/sd_zbc.c:645:4: note: in expansion of macro 'sd_zbc_debug'
sd_zbc_debug(sdkp,
^~~~~~~~~~~~
In file included from drivers/scsi/sd_zbc.c:37:0:
drivers/scsi/sd_zbc.c:666:7: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned close zone request, start %zu/%zu"
^
drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
(sdsk)->disk->disk_name, fmt, ##a) : \
^~~
drivers/scsi/sd_zbc.c:666:7: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned close zone request, start %zu/%zu"
^
drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
(sdsk)->disk->disk_name, fmt, ##a) : \
^~~
drivers/scsi/sd_zbc.c:666:7: warning: format '%zu' expects argument of type 'size_t', but argument 7 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned close zone request, start %zu/%zu"
^
drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
(sdsk)->disk->disk_name, fmt, ##a) : \
^~~
drivers/scsi/sd_zbc.c:666:7: warning: format '%zu' expects argument of type 'size_t', but argument 8 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned close zone request, start %zu/%zu"
^
drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
(sdsk)->disk->disk_name, fmt, ##a) : \
^~~
In file included from include/scsi/scsi_cmnd.h:10:0,
from drivers/scsi/sd_zbc.c:30:
drivers/scsi/sd_zbc.c:666:7: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned close zone request, start %zu/%zu"
^
include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
^~~
drivers/scsi/sd_zbc.c:665:4: note: in expansion of macro 'sd_printk'
sd_printk(KERN_ERR, sdkp,
^~~~~~~~~
drivers/scsi/sd_zbc.c:666:7: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned close zone request, start %zu/%zu"
^
include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
^~~
drivers/scsi/sd_zbc.c:665:4: note: in expansion of macro 'sd_printk'
sd_printk(KERN_ERR, sdkp,
^~~~~~~~~
drivers/scsi/sd_zbc.c:666:7: warning: format '%zu' expects argument of type 'size_t', but argument 7 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned close zone request, start %zu/%zu"
^
include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
^~~
drivers/scsi/sd_zbc.c:665:4: note: in expansion of macro 'sd_printk'
sd_printk(KERN_ERR, sdkp,
^~~~~~~~~
drivers/scsi/sd_zbc.c:666:7: warning: format '%zu' expects argument of type 'size_t', but argument 8 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned close zone request, start %zu/%zu"
^
include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
^~~
drivers/scsi/sd_zbc.c:665:4: note: in expansion of macro 'sd_printk'
sd_printk(KERN_ERR, sdkp,
^~~~~~~~~
In file included from include/linux/kernel.h:13:0,
from include/linux/sched.h:17,
from include/linux/blkdev.h:4,
from drivers/scsi/sd_zbc.c:24:
drivers/scsi/sd_zbc.c: In function 'sd_zbc_setup_finish_cmnd':
>> drivers/scsi/sd_zbc.c:61:11: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
pr_debug("%s %s [%s]: " fmt, \
^
include/linux/printk.h:260:21: note: in definition of macro 'pr_fmt'
#define pr_fmt(fmt) fmt
^~~
include/linux/printk.h:308:2: note: in expansion of macro 'dynamic_pr_debug'
dynamic_pr_debug(fmt, ##__VA_ARGS__)
^~~~~~~~~~~~~~~~
>> drivers/scsi/sd_zbc.c:61:2: note: in expansion of macro 'pr_debug'
pr_debug("%s %s [%s]: " fmt, \
^~~~~~~~
drivers/scsi/sd_zbc.c:717:4: note: in expansion of macro 'sd_zbc_debug'
sd_zbc_debug(sdkp,
^~~~~~~~~~~~
In file included from drivers/scsi/sd_zbc.c:37:0:
drivers/scsi/sd_zbc.c:734:7: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned finish zone request, start %zu/%zu"
^
drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
(sdsk)->disk->disk_name, fmt, ##a) : \
^~~
drivers/scsi/sd_zbc.c:734:7: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned finish zone request, start %zu/%zu"
^
drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
(sdsk)->disk->disk_name, fmt, ##a) : \
^~~
drivers/scsi/sd_zbc.c:734:7: warning: format '%zu' expects argument of type 'size_t', but argument 7 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned finish zone request, start %zu/%zu"
^
drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
(sdsk)->disk->disk_name, fmt, ##a) : \
^~~
drivers/scsi/sd_zbc.c:734:7: warning: format '%zu' expects argument of type 'size_t', but argument 8 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned finish zone request, start %zu/%zu"
^
drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
(sdsk)->disk->disk_name, fmt, ##a) : \
^~~
In file included from include/scsi/scsi_cmnd.h:10:0,
from drivers/scsi/sd_zbc.c:30:
drivers/scsi/sd_zbc.c:734:7: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned finish zone request, start %zu/%zu"
^
include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
^~~
drivers/scsi/sd_zbc.c:733:4: note: in expansion of macro 'sd_printk'
sd_printk(KERN_ERR, sdkp,
^~~~~~~~~
drivers/scsi/sd_zbc.c:734:7: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned finish zone request, start %zu/%zu"
^
include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
^~~
drivers/scsi/sd_zbc.c:733:4: note: in expansion of macro 'sd_printk'
sd_printk(KERN_ERR, sdkp,
^~~~~~~~~
drivers/scsi/sd_zbc.c:734:7: warning: format '%zu' expects argument of type 'size_t', but argument 7 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned finish zone request, start %zu/%zu"
^
include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
^~~
drivers/scsi/sd_zbc.c:733:4: note: in expansion of macro 'sd_printk'
sd_printk(KERN_ERR, sdkp,
^~~~~~~~~
drivers/scsi/sd_zbc.c:734:7: warning: format '%zu' expects argument of type 'size_t', but argument 8 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Unaligned finish zone request, start %zu/%zu"
^
include/scsi/scsi_device.h:233:36: note: in definition of macro 'sdev_printk'
sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
^~~
drivers/scsi/sd_zbc.c:733:4: note: in expansion of macro 'sd_printk'
sd_printk(KERN_ERR, sdkp,
^~~~~~~~~
In file included from include/linux/kernel.h:13:0,
from include/linux/sched.h:17,
from include/linux/blkdev.h:4,
from drivers/scsi/sd_zbc.c:24:
drivers/scsi/sd_zbc.c: In function 'sd_zbc_setup_read_write':
>> drivers/scsi/sd_zbc.c:61:11: warning: format '%zu' expects argument of type 'size_t', but argument 6 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
pr_debug("%s %s [%s]: " fmt, \
^
include/linux/printk.h:260:21: note: in definition of macro 'pr_fmt'
#define pr_fmt(fmt) fmt
^~~
include/linux/printk.h:308:2: note: in expansion of macro 'dynamic_pr_debug'
dynamic_pr_debug(fmt, ##__VA_ARGS__)
^~~~~~~~~~~~~~~~
>> drivers/scsi/sd_zbc.c:61:2: note: in expansion of macro 'pr_debug'
pr_debug("%s %s [%s]: " fmt, \
^~~~~~~~
drivers/scsi/sd_zbc.c:783:3: note: in expansion of macro 'sd_zbc_debug'
sd_zbc_debug(sdkp,
^~~~~~~~~~~~
In file included from drivers/scsi/sd_zbc.c:37:0:
drivers/scsi/sd_zbc.c: In function 'sd_zbc_read_zones':
drivers/scsi/sd_zbc.c:985:8: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'sector_t {aka long long unsigned int}' [-Wformat=]
"Changing capacity from %zu "
^
drivers/scsi/sd.h:116:31: note: in definition of macro 'sd_printk'
(sdsk)->disk->disk_name, fmt, ##a) : \
^~~
vim +61 drivers/scsi/sd_zbc.c
18 * along with this program; see the file COPYING. If not, write to
19 * the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139,
20 * USA.
21 *
22 */
23
> 24 #include <linux/blkdev.h>
25 #include <linux/rbtree.h>
26
27 #include <asm/unaligned.h>
28
29 #include <scsi/scsi.h>
> 30 #include <scsi/scsi_cmnd.h>
31 #include <scsi/scsi_dbg.h>
32 #include <scsi/scsi_device.h>
33 #include <scsi/scsi_driver.h>
34 #include <scsi/scsi_host.h>
35 #include <scsi/scsi_eh.h>
36
> 37 #include "sd.h"
38 #include "scsi_priv.h"
39
40 enum zbc_zone_type {
41 ZBC_ZONE_TYPE_CONV = 0x1,
42 ZBC_ZONE_TYPE_SEQWRITE_REQ,
43 ZBC_ZONE_TYPE_SEQWRITE_PREF,
44 ZBC_ZONE_TYPE_RESERVED,
45 };
46
47 enum zbc_zone_cond {
48 ZBC_ZONE_COND_NO_WP,
49 ZBC_ZONE_COND_EMPTY,
50 ZBC_ZONE_COND_IMP_OPEN,
51 ZBC_ZONE_COND_EXP_OPEN,
52 ZBC_ZONE_COND_CLOSED,
53 ZBC_ZONE_COND_READONLY = 0xd,
54 ZBC_ZONE_COND_FULL,
55 ZBC_ZONE_COND_OFFLINE,
56 };
57
58 #define SD_ZBC_BUF_SIZE 131072
59
60 #define sd_zbc_debug(sdkp, fmt, args...) \
> 61 pr_debug("%s %s [%s]: " fmt, \
62 dev_driver_string(&(sdkp)->device->sdev_gendev), \
63 dev_name(&(sdkp)->device->sdev_gendev), \
64 (sdkp)->disk->disk_name, ## args)
65
66 #define sd_zbc_debug_ratelimit(sdkp, fmt, args...) \
67 do { \
68 if (printk_ratelimit()) \
69 sd_zbc_debug(sdkp, fmt, ## args); \
70 } while( 0 )
71
72 #define sd_zbc_err(sdkp, fmt, args...) \
> 73 pr_err("%s %s [%s]: " fmt, \
74 dev_driver_string(&(sdkp)->device->sdev_gendev), \
75 dev_name(&(sdkp)->device->sdev_gendev), \
76 (sdkp)->disk->disk_name, ## args)
77
78 struct zbc_zone_work {
79 struct work_struct zone_work;
80 struct scsi_disk *sdkp;
81 sector_t sector;
82 sector_t nr_sects;
83 bool init;
84 unsigned int nr_zones;
85 };
86
87 struct blk_zone *zbc_desc_to_zone(struct scsi_disk *sdkp, unsigned char *rec)
88 {
89 struct blk_zone *zone;
90
91 zone = kzalloc(sizeof(struct blk_zone), GFP_KERNEL);
92 if (!zone)
93 return NULL;
94
95 /* Zone type */
96 switch(rec[0] & 0x0f) {
97 case ZBC_ZONE_TYPE_CONV:
98 case ZBC_ZONE_TYPE_SEQWRITE_REQ:
99 case ZBC_ZONE_TYPE_SEQWRITE_PREF:
100 zone->type = rec[0] & 0x0f;
101 break;
102 default:
103 zone->type = BLK_ZONE_TYPE_UNKNOWN;
104 break;
105 }
106
107 /* Zone condition */
108 zone->cond = (rec[1] >> 4) & 0xf;
109 if (rec[1] & 0x01)
110 zone->reset = 1;
111 if (rec[1] & 0x02)
112 zone->non_seq = 1;
113
114 /* Zone start sector and length */
115 zone->len = logical_to_sectors(sdkp->device,
116 get_unaligned_be64(&rec[8]));
117 zone->start = logical_to_sectors(sdkp->device,
118 get_unaligned_be64(&rec[16]));
119
120 /* Zone write pointer */
121 if (blk_zone_is_empty(zone) &&
122 zone->wp != zone->start)
123 zone->wp = zone->start;
124 else if (blk_zone_is_full(zone))
125 zone->wp = zone->start + zone->len;
126 else if (blk_zone_is_seq(zone))
127 zone->wp = logical_to_sectors(sdkp->device,
128 get_unaligned_be64(&rec[24]));
129 else
130 zone->wp = (sector_t)-1;
131
132 return zone;
133 }
134
135 static int zbc_parse_zones(struct scsi_disk *sdkp, unsigned char *buf,
136 unsigned int buf_len, sector_t *next_sector)
137 {
138 struct request_queue *q = sdkp->disk->queue;
139 sector_t capacity = logical_to_sectors(sdkp->device, sdkp->capacity);
140 unsigned char *rec = buf;
141 unsigned int zone_len, list_length;
142
143 /* Parse REPORT ZONES header */
144 list_length = get_unaligned_be32(&buf[0]);
145 rec = buf + 64;
146 list_length += 64;
147
148 if (list_length < buf_len)
149 buf_len = list_length;
150
151 /* Parse REPORT ZONES zone descriptors */
152 *next_sector = capacity;
153 while (rec < buf + buf_len) {
154
155 struct blk_zone *new, *old;
156
157 new = zbc_desc_to_zone(sdkp, rec);
158 if (!new)
159 return -ENOMEM;
160
161 zone_len = new->len;
162 *next_sector = new->start + zone_len;
163
164 old = blk_insert_zone(q, new);
165 if (old) {
166 blk_lock_zone(old);
167
168 /*
169 * Always update the zone state flags and the zone
170 * offline and read-only condition as the drive may
171 * change those independently of the commands being
172 * executed
173 */
174 old->reset = new->reset;
175 old->non_seq = new->non_seq;
176 if (blk_zone_is_offline(new) ||
177 blk_zone_is_readonly(new))
178 old->cond = new->cond;
179
180 if (blk_zone_in_update(old)) {
181 old->cond = new->cond;
182 old->wp = new->wp;
183 blk_clear_zone_update(old);
184 }
185
186 blk_unlock_zone(old);
187
188 kfree(new);
189 }
190
191 rec += 64;
192
193 }
194
195 return 0;
196 }
197
198 /**
199 * sd_zbc_report_zones - Issue a REPORT ZONES scsi command
200 * @sdkp: SCSI disk to which the command should be send
201 * @buffer: response buffer
202 * @bufflen: length of @buffer
203 * @start_sector: logical sector for the zone information should be reported
204 * @option: reporting option to be used
205 * @partial: flag to set the 'partial' bit for report zones command
206 */
207 int sd_zbc_report_zones(struct scsi_disk *sdkp, unsigned char *buffer,
208 int bufflen, sector_t start_sector,
209 enum zbc_zone_reporting_options option, bool partial)
210 {
211 struct scsi_device *sdp = sdkp->device;
212 const int timeout = sdp->request_queue->rq_timeout;
213 struct scsi_sense_hdr sshdr;
214 sector_t start_lba = sectors_to_logical(sdkp->device, start_sector);
215 unsigned char cmd[16];
216 int result;
217
218 if (!scsi_device_online(sdp))
219 return -ENODEV;
220
> 221 sd_zbc_debug(sdkp, "REPORT ZONES lba %zu len %d\n",
222 start_lba, bufflen);
223
224 memset(cmd, 0, 16);
225 cmd[0] = ZBC_IN;
226 cmd[1] = ZI_REPORT_ZONES;
227 put_unaligned_be64(start_lba, &cmd[2]);
228 put_unaligned_be32(bufflen, &cmd[10]);
229 cmd[14] = (partial ? ZBC_REPORT_ZONE_PARTIAL : 0) | option;
230 memset(buffer, 0, bufflen);
231
232 result = scsi_execute_req(sdp, cmd, DMA_FROM_DEVICE,
233 buffer, bufflen, &sshdr,
234 timeout, SD_MAX_RETRIES, NULL);
235
236 if (result) {
> 237 sd_zbc_err(sdkp,
238 "REPORT ZONES lba %zu failed with %d/%d\n",
239 start_lba, host_byte(result), driver_byte(result));
240 return -EIO;
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 55864 bytes --]
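The -Wformat warnings above all come from printing a sector_t value with %zu, which expects a size_t. One common way to silence them is to cast the value to unsigned long long and print it with %llu. The following is a minimal user-space sketch of that idea, not the fix applied to the series; the sector_t typedef and the format_report_zones_msg() helper are assumptions made for illustration only.

```c
#include <stdio.h>

/* Hypothetical stand-in for the kernel's sector_t on this configuration. */
typedef unsigned long long sector_t;

/*
 * Build a message similar to the REPORT ZONES debug line. The explicit
 * cast to unsigned long long matches the %llu conversion however
 * sector_t happens to be defined, which %zu (size_t) does not.
 */
int format_report_zones_msg(char *buf, size_t len, sector_t lba, int bufflen)
{
	return snprintf(buf, len, "REPORT ZONES lba %llu len %d",
			(unsigned long long)lba, bufflen);
}
```

The same cast-and-%llu pattern would apply to each sd_printk() and sd_zbc_debug() call site flagged in the log.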
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 9/9] blk-zoned: Add ioctl interface for zone operations
2016-09-19 21:27 ` Damien Le Moal
@ 2016-09-20 2:39 ` kbuild test robot
-1 siblings, 0 replies; 36+ messages in thread
From: kbuild test robot @ 2016-09-20 2:39 UTC (permalink / raw)
To: Damien Le Moal
Cc: kbuild-all, linux-scsi, linux-block, martin.petersen, axboe,
hare, shaun.tancheff, Damien Le Moal
[-- Attachment #1: Type: text/plain, Size: 3299 bytes --]
Hi Shaun,
[auto build test ERROR on linus/master]
[also build test ERROR on v4.8-rc7]
[cannot apply to block/for-next next-20160919]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
[Suggest to use git(>=2.9.0) format-patch --base=<commit> (or --base=auto for convenience) to record what (public, well-known) commit your patch series was built on]
[Check https://git-scm.com/docs/git-format-patch for more information]
url: https://github.com/0day-ci/linux/commits/Damien-Le-Moal/ZBC-Zoned-block-device-support/20160920-062608
config: blackfin-allyesconfig (attached as .config)
compiler: bfin-uclinux-gcc (GCC) 6.2.0
reproduce:
wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=blackfin
All error/warnings (new ones prefixed by >>):
In file included from include/linux/linkage.h:4:0,
from include/linux/kernel.h:6,
from block/blk-zoned.c:11:
In function 'blkdev_zone_action_ioctl',
inlined from 'blkdev_zone_ioctl' at block/blk-zoned.c:445:7:
>> include/linux/compiler.h:491:38: error: call to '__compiletime_assert_382' declared with attribute error: BUILD_BUG_ON failed: ptr_size >= 8
_compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
^
include/linux/compiler.h:474:4: note: in definition of macro '__compiletime_assert'
prefix ## suffix(); \
^~~~~~
include/linux/compiler.h:491:2: note: in expansion of macro '_compiletime_assert'
_compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
^~~~~~~~~~~~~~~~~~~
include/linux/bug.h:51:37: note: in expansion of macro 'compiletime_assert'
#define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
^~~~~~~~~~~~~~~~~~
include/linux/bug.h:75:2: note: in expansion of macro 'BUILD_BUG_ON_MSG'
BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition)
^~~~~~~~~~~~~~~~
>> arch/blackfin/include/asm/uaccess.h:136:3: note: in expansion of macro 'BUILD_BUG_ON'
BUILD_BUG_ON(ptr_size >= 8); \
^~~~~~~~~~~~
>> block/blk-zoned.c:382:6: note: in expansion of macro 'get_user'
if (get_user(sector, (u64 __user *)argp))
^~~~~~~~
vim +/get_user +382 block/blk-zoned.c
366 z.reset = zone->reset;
367
368 blk_unlock_zone(zone);
369
370 if (copy_to_user(argp, &z, sizeof(struct blkzone)))
371 return -EFAULT;
372
373 return 0;
374 }
375
376 static int blkdev_zone_action_ioctl(struct block_device *bdev,
377 unsigned cmd, void __user *argp)
378 {
379 unsigned int op;
380 u64 sector;
381
> 382 if (get_user(sector, (u64 __user *)argp))
383 return -EFAULT;
384
385 switch (cmd) {
386 case BLKRESETZONE:
387 op = REQ_OP_ZONE_RESET;
388 break;
389 case BLKOPENZONE:
390 op = REQ_OP_ZONE_OPEN;
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 41390 bytes --]
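The build error above is raised because the blackfin get_user() rejects 8-byte accesses (BUILD_BUG_ON(ptr_size >= 8)). A size-agnostic copy such as copy_from_user() avoids that dependency. The sketch below shows the shape of such a change in user space; mock_copy_from_user() and fetch_sector_arg() are hypothetical names introduced here, and the mock only imitates the kernel convention of returning the number of bytes left uncopied.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical user-space stand-in for the kernel's copy_from_user():
 * returns the number of bytes that could NOT be copied (0 on success).
 */
static unsigned long mock_copy_from_user(void *to, const void *from,
					 unsigned long n)
{
	if (!from)
		return n;
	memcpy(to, from, n);
	return 0;
}

/*
 * Fetch a 64-bit sector argument with a byte copy instead of get_user(),
 * so the code does not rely on the architecture supporting 8-byte
 * get_user() accesses.
 */
int fetch_sector_arg(const void *argp, uint64_t *sector)
{
	if (mock_copy_from_user(sector, argp, sizeof(*sector)))
		return -EFAULT;
	return 0;
}
```

In blkdev_zone_action_ioctl() the equivalent would be to replace the get_user() call at line 382 with a copy of sizeof(sector) bytes from argp, keeping the -EFAULT return on failure.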
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 1/9] block: Add 'zoned' queue limit
2016-09-19 21:27 ` Damien Le Moal
@ 2016-09-20 4:05 ` Bart Van Assche
-1 siblings, 0 replies; 36+ messages in thread
From: Bart Van Assche @ 2016-09-20 4:05 UTC (permalink / raw)
To: Damien Le Moal, linux-scsi, linux-block
Cc: martin.petersen, axboe, hare, shaun.tancheff
On 09/19/16 14:27, Damien Le Moal wrote:
> +/*
> + * Zoned block device models (zoned limit).
> + */
> +enum blk_zoned_model {
> + BLK_ZONED_NONE, /* Regular block device */
> + BLK_ZONED_HA, /* Host-aware zoned block device */
> + BLK_ZONED_HM, /* Host-managed zoned block device */
> +};
[ ... ]
> +static inline unsigned int blk_queue_zoned(struct request_queue *q)
> +{
> + return q->limits.zoned;
> +}
> +
> /*
> * We regard a request as sync, if either a read or a sync write
> */
> @@ -1354,6 +1369,16 @@ static inline unsigned int bdev_write_same(struct block_device *bdev)
> return 0;
> }
>
> +static inline unsigned int bdev_zoned(struct block_device *bdev)
> +{
> + struct request_queue *q = bdev_get_queue(bdev);
> +
> + if (q)
> + return blk_queue_zoned(q);
> +
> + return 0;
> +}
Hello Damien,
Please consider changing the return type of the above two functions to
"enum blk_zoned_model" to make it clear that both return one of the
BLK_ZONED_* constants.
Thanks,
Bart.
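The suggestion above can be sketched as follows. This is an illustration under simplifying assumptions rather than the revised patch: the three structures are reduced, hypothetical stand-ins for the kernel definitions, and the queue lookup is a plain pointer instead of bdev_get_queue().

```c
#include <stddef.h>

enum blk_zoned_model {
	BLK_ZONED_NONE,	/* Regular block device */
	BLK_ZONED_HA,	/* Host-aware zoned block device */
	BLK_ZONED_HM,	/* Host-managed zoned block device */
};

/* Simplified, hypothetical stand-ins for the kernel structures. */
struct queue_limits { enum blk_zoned_model zoned; };
struct request_queue { struct queue_limits limits; };
struct block_device { struct request_queue *queue; };

/* Returning the enum type documents which values callers can expect. */
static inline enum blk_zoned_model blk_queue_zoned(struct request_queue *q)
{
	return q->limits.zoned;
}

static inline enum blk_zoned_model bdev_zoned(struct block_device *bdev)
{
	struct request_queue *q = bdev->queue;

	if (q)
		return blk_queue_zoned(q);

	/* No queue: report a regular block device by name, not by 0. */
	return BLK_ZONED_NONE;
}
```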
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 4/9] block: Define zoned block device operations
2016-09-19 21:27 ` Damien Le Moal
@ 2016-09-20 4:05 ` Bart Van Assche
-1 siblings, 0 replies; 36+ messages in thread
From: Bart Van Assche @ 2016-09-20 4:05 UTC (permalink / raw)
To: Damien Le Moal, linux-scsi, linux-block
Cc: martin.petersen, axboe, hare, shaun.tancheff
On 09/19/16 14:27, Damien Le Moal wrote:
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 36c7ac3..4a7f7ba 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -1941,6 +1941,13 @@ generic_make_request_checks(struct bio *bio)
> case REQ_OP_WRITE_SAME:
> if (!bdev_write_same(bio->bi_bdev))
> goto not_supported;
> + case REQ_OP_ZONE_REPORT:
> + case REQ_OP_ZONE_RESET:
> + case REQ_OP_ZONE_OPEN:
> + case REQ_OP_ZONE_CLOSE:
> + case REQ_OP_ZONE_FINISH:
> + if (!bdev_zoned(bio->bi_bdev))
> + goto not_supported;
Patch 1/9 introduced the BLK_ZONED_* constants. The above code compares
the bdev_zoned() return value against 0, so it will break if the numeric
values of the BLK_ZONED_* constants are ever changed. Please change the
above code so that it compares the bdev_zoned() return value against a
BLK_ZONED_* constant.
> + * Operation:
> + * REQ_OP_ZONE_REPORT: Request information for all zones or for a single zone.
> + * REQ_OP_ZONE_RESET: Reset the write pointer of all zones or of a single zone.
> + * REQ_OP_ZONE_OPEN: Explicitely open the maximum allowed number of zones or
> + * a single zone. For the former case, the zones that will
> + * actually be open are chosen by the disk.
> + * REQ_OP_ZONE_CLOSE: Close all implicitely or explicitely open zones or
> + * a single zone.
> + * REQ_OP_ZONE_FINISH: Transition one or all open and closed zones to the full
> + * condition.
Please change *plicitely into *plicitly.
Thanks,
Bart.
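A comparison against a named constant, as requested in the first comment above, could look like the sketch below. The zone_ops_supported() helper is a hypothetical name used only for this illustration; in the patch itself the check would sit directly in generic_make_request_checks().

```c
#include <stdbool.h>

enum blk_zoned_model {
	BLK_ZONED_NONE,	/* Regular block device */
	BLK_ZONED_HA,	/* Host-aware zoned block device */
	BLK_ZONED_HM,	/* Host-managed zoned block device */
};

/*
 * Hypothetical helper: zone operations are allowed on any zoned model.
 * Comparing against BLK_ZONED_NONE keeps the test valid even if the
 * enumerators are later renumbered.
 */
static inline bool zone_ops_supported(enum blk_zoned_model model)
{
	return model != BLK_ZONED_NONE;
}
```

With such a check, the zone operation cases would read "if (bdev_zoned(bio->bi_bdev) == BLK_ZONED_NONE) goto not_supported;" instead of relying on the constant being zero.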
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 5/9] block: Implement support for zoned block devices
2016-09-19 21:27 ` Damien Le Moal
@ 2016-09-20 4:18 ` Bart Van Assche
-1 siblings, 0 replies; 36+ messages in thread
From: Bart Van Assche @ 2016-09-20 4:18 UTC (permalink / raw)
To: Damien Le Moal, linux-scsi, linux-block
Cc: martin.petersen, axboe, hare, shaun.tancheff
On 09/19/16 14:27, Damien Le Moal wrote:
> + /*
> + * Make sure bi_size does not overflow because
> + * of some weird very large zone size.
> + */
> + if (nr_sects && (unsigned long long)nr_sects << 9 > UINT_MAX)
> + return -EINVAL;
> +
> + bio = bio_alloc(gfp_mask, 1);
> + if (!bio)
> + return -ENOMEM;
> +
> + bio->bi_iter.bi_sector = sector;
> + bio->bi_iter.bi_size = nr_sects << 9;
> + bio->bi_vcnt = 0;
> + bio->bi_bdev = bdev;
> + bio_set_op_attrs(bio, op, 0);
Hello Damien and Hannes,
nr_sects is cast to unsigned long long for the overflow test but not
when assigning bi_size. To me this looks like an inconsistency. Please
make both expressions consistent.
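One way to do that, as a minimal userspace sketch: shift once in 64-bit
arithmetic and use that same value for both the range check and the
assignment. sectors_to_bi_size() is a hypothetical helper and the uint64_t
type for nr_sects is an assumption made for illustration:

```c
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert a count of 512-byte sectors into a byte
 * count that must fit in an unsigned int, as bi_size does. Returns the
 * byte count, or -1 if the result would not fit.
 */
static int64_t sectors_to_bi_size(uint64_t nr_sects)
{
	uint64_t bytes;

	/* Reject counts for which the 64-bit shift itself would wrap. */
	if (nr_sects > (UINT64_MAX >> 9))
		return -1;

	/* Single widened computation, reused for check and result. */
	bytes = nr_sects << 9;
	if (bytes > UINT_MAX)
		return -1;

	return (int64_t)bytes;
}
```

With this shape the overflow test and the value stored in bi_size cannot
drift apart, because there is only one expression.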
Thanks,
Bart.
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 8/9] sd: Implement support for ZBC devices
2016-09-19 21:27 ` Damien Le Moal
@ 2016-09-20 5:40 ` Shaun Tancheff
-1 siblings, 0 replies; 36+ messages in thread
From: Shaun Tancheff @ 2016-09-20 5:40 UTC (permalink / raw)
To: Damien Le Moal
Cc: linux-scsi, linux-block, Martin K. Petersen, Jens Axboe,
Hannes Reinecke, Hannes Reinecke
On Mon, Sep 19, 2016 at 4:27 PM, Damien Le Moal <damien.lemoal@hgst.com> wrote:
> From: Hannes Reinecke <hare@suse.com>
>
> Implement ZBC support functions to setup zoned disks and fill the
> block device zone information tree during the device scan. The
> zone information tree is also always updated on disk revalidation.
> This adds support for the REQ_OP_ZONE* operations and also implements
> the new RESET_WP provisioning mode so that discard requests can be
> mapped to the RESET WRITE POINTER command for devices with a constant
> zone size.
>
> The capacity read of the device triggers the zone information read
> for zoned block devices. As this needs the device zone model, the
> call to sd_read_capacity is moved after the call to
> sd_read_block_characteristics so that host-aware devices are
> properly initialized. The call to sd_zbc_read_zones in
> sd_read_capacity may change the device capacity obtained with
> the sd_read_capacity_16 function for devices reporting only the
> capacity of conventional zones at the beginning of the LBA range
> (i.e. devices with rc_basis set to 0).
>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
> ---
> drivers/scsi/Makefile | 1 +
> drivers/scsi/sd.c | 147 ++++--
> drivers/scsi/sd.h | 68 +++
> drivers/scsi/sd_zbc.c | 1097 +++++++++++++++++++++++++++++++++++++++++++++
> include/scsi/scsi_proto.h | 17 +
> 5 files changed, 1304 insertions(+), 26 deletions(-)
> create mode 100644 drivers/scsi/sd_zbc.c
>
> diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile
> index d539798..fabcb6d 100644
> --- a/drivers/scsi/Makefile
> +++ b/drivers/scsi/Makefile
> @@ -179,6 +179,7 @@ hv_storvsc-y := storvsc_drv.o
>
> sd_mod-objs := sd.o
> sd_mod-$(CONFIG_BLK_DEV_INTEGRITY) += sd_dif.o
> +sd_mod-$(CONFIG_BLK_DEV_ZONED) += sd_zbc.o
>
> sr_mod-objs :=3D sr.o sr_ioctl.o sr_vendor.o
> ncr53c8xx-flags-$(CONFIG_SCSI_ZALON) \
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index d3e852a..46b8b78 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -92,6 +92,7 @@ MODULE_ALIAS_BLOCKDEV_MAJOR(SCSI_DISK15_MAJOR);
> MODULE_ALIAS_SCSI_DEVICE(TYPE_DISK);
> MODULE_ALIAS_SCSI_DEVICE(TYPE_MOD);
> MODULE_ALIAS_SCSI_DEVICE(TYPE_RBC);
> +MODULE_ALIAS_SCSI_DEVICE(TYPE_ZBC);
>
> #if !defined(CONFIG_DEBUG_BLOCK_EXT_DEVT)
> #define SD_MINORS 16
> @@ -99,7 +100,6 @@ MODULE_ALIAS_SCSI_DEVICE(TYPE_RBC);
> #define SD_MINORS 0
> #endif
>
> -static void sd_config_discard(struct scsi_disk *, unsigned int);
> static void sd_config_write_same(struct scsi_disk *);
> static int sd_revalidate_disk(struct gendisk *);
> static void sd_unlock_native_capacity(struct gendisk *disk);
> @@ -162,7 +162,7 @@ cache_type_store(struct device *dev, struct device_attribute *attr,
> static const char temp[] = "temporary ";
> int len;
>
> - if (sdp->type != TYPE_DISK)
> + if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
> /* no cache control on RBC devices; theoretically they
> * can do it, but there's probably so many exceptions
> * it's not worth the risk */
> @@ -261,7 +261,7 @@ allow_restart_store(struct device *dev, struct device_attribute *attr,
> if (!capable(CAP_SYS_ADMIN))
> return -EACCES;
>
> - if (sdp->type != TYPE_DISK)
> + if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
> return -EINVAL;
>
> sdp->allow_restart = simple_strtoul(buf, NULL, 10);
> @@ -369,6 +369,7 @@ static const char *lbp_mode[] = {
> [SD_LBP_WS16] = "writesame_16",
> [SD_LBP_WS10] = "writesame_10",
> [SD_LBP_ZERO] = "writesame_zero",
> + [SD_ZBC_RESET_WP] = "reset_wp",
> [SD_LBP_DISABLE] = "disabled",
> };
>
> @@ -391,6 +392,13 @@ provisioning_mode_store(struct device *dev, struct device_attribute *attr,
> if (!capable(CAP_SYS_ADMIN))
> return -EACCES;
>
> + if (sdkp->zoned == 1 || sdp->type == TYPE_ZBC) {
> + if (!strncmp(buf, lbp_mode[SD_ZBC_RESET_WP], 20)) {
> + sd_config_discard(sdkp, SD_ZBC_RESET_WP);
> + return count;
> + }
> + return -EINVAL;
> + }
> if (sdp->type != TYPE_DISK)
> return -EINVAL;
>
> @@ -458,7 +466,7 @@ max_write_same_blocks_store(struct device *dev, struct device_attribute *attr,
> if (!capable(CAP_SYS_ADMIN))
> return -EACCES;
>
> - if (sdp->type != TYPE_DISK)
> + if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
> return -EINVAL;
>
> err = kstrtoul(buf, 10, &max);
> @@ -631,7 +639,7 @@ static unsigned char sd_setup_protect_cmnd(struct scsi_cmnd *scmd,
> return protect;
> }
>
> -static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode)
> +void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode)
> {
> struct request_queue *q = sdkp->disk->queue;
> unsigned int logical_block_size = sdkp->device->sector_size;
> @@ -683,6 +691,11 @@ static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode)
> q->limits.discard_zeroes_data = sdkp->lbprz;
> break;
>
> + case SD_ZBC_RESET_WP:
> + max_blocks = min_not_zero(sdkp->max_unmap_blocks,
> + (u32)SD_MAX_WS16_BLOCKS);
> + break;
> +
> case SD_LBP_ZERO:
> max_blocks = min_not_zero(sdkp->max_ws_blocks,
> (u32)SD_MAX_WS10_BLOCKS);
> @@ -711,16 +724,20 @@ static int sd_setup_discard_cmnd(struct scsi_cmnd *cmd)
> unsigned int nr_sectors = blk_rq_sectors(rq);
> unsigned int nr_bytes = blk_rq_bytes(rq);
> unsigned int len;
> - int ret;
> + int ret = BLKPREP_OK;
> char *buf;
> - struct page *page;
> + struct page *page = NULL;
>
> sector >>= ilog2(sdp->sector_size) - 9;
> nr_sectors >>= ilog2(sdp->sector_size) - 9;
>
> - page = alloc_page(GFP_ATOMIC | __GFP_ZERO);
> - if (!page)
> - return BLKPREP_DEFER;
> + if (sdkp->provisioning_mode != SD_ZBC_RESET_WP) {
> + page = alloc_page(GFP_ATOMIC | __GFP_ZERO);
> + if (!page)
> + return BLKPREP_DEFER;
> + }
> +
> + rq->completion_data = page;
>
> switch (sdkp->provisioning_mode) {
> case SD_LBP_UNMAP:
> @@ -760,12 +777,19 @@ static int sd_setup_discard_cmnd(struct scsi_cmnd *cmd)
> len = sdkp->device->sector_size;
> break;
>
> + case SD_ZBC_RESET_WP:
> + ret = sd_zbc_setup_reset_cmnd(cmd);
> + if (ret != BLKPREP_OK)
> + goto out;
> + /* Reset Write Pointer doesn't have a payload */
> + len = 0;
> + break;
> +
> default:
> ret = BLKPREP_INVALID;
> goto out;
> }
>
> - rq->completion_data = page;
> rq->timeout = SD_TIMEOUT;
>
> cmd->transfersize = len;
> @@ -779,13 +803,17 @@ static int sd_setup_discard_cmnd(struct scsi_cmnd *cmd)
> * discarded on disk. This allows us to report completion on the full
> * amount of blocks described by the request.
> */
> - blk_add_request_payload(rq, page, 0, len);
> - ret = scsi_init_io(cmd);
> + if (len) {
> + blk_add_request_payload(rq, page, 0, len);
> + ret = scsi_init_io(cmd);
> + }
> rq->__data_len = nr_bytes;
>
> out:
> - if (ret != BLKPREP_OK)
> + if (page && ret != BLKPREP_OK) {
> + rq->completion_data = NULL;
> __free_page(page);
> + }
> return ret;
> }
>
> @@ -843,6 +871,13 @@ static int sd_setup_write_same_cmnd(struct scsi_cmnd *cmd)
>
> BUG_ON(bio_offset(bio) || bio_iovec(bio).bv_len != sdp->sector_size);
>
> + if (sdkp->zoned == 1 || sdp->type == TYPE_ZBC) {
> + /* sd_zbc_setup_read_write uses block layer sector units */
> + ret = sd_zbc_setup_read_write(sdkp, rq, sector, &nr_sectors);
> + if (ret != BLKPREP_OK)
> + return ret;
> + }
> +
> sector >>= ilog2(sdp->sector_size) - 9;
> nr_sectors >>= ilog2(sdp->sector_size) - 9;
>
> @@ -962,6 +997,13 @@ static int sd_setup_read_write_cmnd(struct scsi_cmnd *SCpnt)
> SCSI_LOG_HLQUEUE(2, scmd_printk(KERN_INFO, SCpnt, "block=%llu\n",
> (unsigned long long)block));
>
> + if (sdkp->zoned == 1 || sdp->type == TYPE_ZBC) {
> + /* sd_zbc_setup_read_write uses block layer sector units */
> + ret = sd_zbc_setup_read_write(sdkp, rq, block, &this_count);
> + if (ret != BLKPREP_OK)
> + goto out;
> + }
> +
> /*
> * If we have a 1K hardware sectorsize, prevent access to single
> * 512 byte sectors. In theory we could handle this - in fact
> @@ -1148,6 +1190,16 @@ static int sd_init_command(struct scsi_cmnd *cmd)
> case REQ_OP_READ:
> case REQ_OP_WRITE:
> return sd_setup_read_write_cmnd(cmd);
> + case REQ_OP_ZONE_REPORT:
> + return sd_zbc_setup_report_cmnd(cmd);
> + case REQ_OP_ZONE_RESET:
> + return sd_zbc_setup_reset_cmnd(cmd);
> + case REQ_OP_ZONE_OPEN:
> + return sd_zbc_setup_open_cmnd(cmd);
> + case REQ_OP_ZONE_CLOSE:
> + return sd_zbc_setup_close_cmnd(cmd);
> + case REQ_OP_ZONE_FINISH:
> + return sd_zbc_setup_finish_cmnd(cmd);
> default:
> BUG();
> }
> @@ -1157,7 +1209,8 @@ static void sd_uninit_command(struct scsi_cmnd *SCpnt)
> {
> struct request *rq = SCpnt->request;
>
> - if (req_op(rq) == REQ_OP_DISCARD)
> + if (req_op(rq) == REQ_OP_DISCARD &&
> + rq->completion_data)
> __free_page(rq->completion_data);
>
> if (SCpnt->cmnd != rq->cmd) {
> @@ -1778,8 +1831,16 @@ static int sd_done(struct scsi_cmnd *SCpnt)
> int sense_deferred = 0;
> unsigned char op = SCpnt->cmnd[0];
> unsigned char unmap = SCpnt->cmnd[1] & 8;
> + unsigned char sa = SCpnt->cmnd[1] & 0xf;
>
> - if (req_op(req) == REQ_OP_DISCARD || req_op(req) == REQ_OP_WRITE_SAME) {
> + switch(req_op(req)) {
> + case REQ_OP_DISCARD:
> + case REQ_OP_WRITE_SAME:
> + case REQ_OP_ZONE_REPORT:
> + case REQ_OP_ZONE_RESET:
> + case REQ_OP_ZONE_OPEN:
> + case REQ_OP_ZONE_CLOSE:
> + case REQ_OP_ZONE_FINISH:
> if (!result) {
> good_bytes = blk_rq_bytes(req);
> scsi_set_resid(SCpnt, 0);
> @@ -1787,6 +1848,7 @@ static int sd_done(struct scsi_cmnd *SCpnt)
> good_bytes = 0;
> scsi_set_resid(SCpnt, blk_rq_bytes(req));
> }
> + break;
> }
>
> if (result) {
> @@ -1829,6 +1891,10 @@ static int sd_done(struct scsi_cmnd *SCpnt)
> case UNMAP:
> sd_config_discard(sdkp, SD_LBP_DISABLE);
> break;
> + case ZBC_OUT:
> + if (sa == ZO_RESET_WRITE_POINTER)
> + sd_config_discard(sdkp, SD_LBP_DISABLE);
> + break;
> case WRITE_SAME_16:
> case WRITE_SAME:
> if (unmap)
> @@ -1847,7 +1913,11 @@ static int sd_done(struct scsi_cmnd *SCpnt)
> default:
> break;
> }
> +
> out:
> + if (sdkp->zoned == 1 || sdkp->device->type == TYPE_ZBC)
> + sd_zbc_done(SCpnt, &sshdr);
> +
> SCSI_LOG_HLCOMPLETE(1, scmd_printk(KERN_INFO, SCpnt,
> "sd_done: completed %d of %d bytes\n",
> good_bytes, scsi_bufflen(SCpnt)));
> @@ -1982,7 +2052,6 @@ sd_spinup_disk(struct scsi_disk *sdkp)
> }
> }
>
> -
> /*
> * Determine whether disk supports Data Integrity Field.
> */
> @@ -2132,6 +2201,9 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
> /* Logical blocks per physical block exponent */
> sdkp->physical_block_size = (1 << (buffer[13] & 0xf)) * sector_size;
>
> + /* RC basis */
> + sdkp->rc_basis = (buffer[12] >> 4) & 0x3;
> +
> /* Lowest aligned logical block */
> alignment = ((buffer[14] & 0x3f) << 8 | buffer[15]) * sector_size;
> blk_queue_alignment_offset(sdp->request_queue, alignment);
> @@ -2322,6 +2394,11 @@ got_data:
> sector_size = 512;
> }
> blk_queue_logical_block_size(sdp->request_queue, sector_size);
> + blk_queue_physical_block_size(sdp->request_queue,
> + sdkp->physical_block_size);
> + sdkp->device->sector_size = sector_size;
> +
> + sd_zbc_read_zones(sdkp, buffer);
>
> {
> char cap_str_2[10], cap_str_10[10];
> @@ -2348,9 +2425,6 @@ got_data:
> if (sdkp->capacity > 0xffffffff)
> sdp->use_16_for_rw = 1;
>
> - blk_queue_physical_block_size(sdp->request_queue,
> - sdkp->physical_block_size);
> - sdkp->device->sector_size = sector_size;
> }
>
> /* called with buffer of length 512 */
> @@ -2612,7 +2686,7 @@ static void sd_read_app_tag_own(struct scsi_disk *sdkp, unsigned char *buffer)
> struct scsi_mode_data data;
> struct scsi_sense_hdr sshdr;
>
> - if (sdp->type != TYPE_DISK)
> + if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
> return;
>
> if (sdkp->protection_type == 0)
> @@ -2719,6 +2793,7 @@ static void sd_read_block_limits(struct scsi_disk *sdkp)
> */
> static void sd_read_block_characteristics(struct scsi_disk *sdkp)
> {
> + struct request_queue *q = sdkp->disk->queue;
> unsigned char *buffer;
> u16 rot;
> const int vpd_len = 64;
> @@ -2733,10 +2808,21 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
> rot = get_unaligned_be16(&buffer[4]);
>
> if (rot == 1) {
> - queue_flag_set_unlocked(QUEUE_FLAG_NONROT, sdkp->disk->queue);
> - queue_flag_clear_unlocked(QUEUE_FLAG_ADD_RANDOM, sdkp->disk->queue);
> + queue_flag_set_unlocked(QUEUE_FLAG_NONROT, q);
> + queue_flag_clear_unlocked(QUEUE_FLAG_ADD_RANDOM, q);
> }
>
> + sdkp->zoned = (buffer[8] >> 4) & 3;
> + if (sdkp->zoned == 1)
> + q->limits.zoned = BLK_ZONED_HA;
> + else if (sdkp->device->type == TYPE_ZBC)
> + q->limits.zoned = BLK_ZONED_HM;
> + else
> + q->limits.zoned = BLK_ZONED_NONE;
> + if (blk_queue_zoned(q) && sdkp->first_scan)
> + sd_printk(KERN_NOTICE, sdkp, "Host-%s zoned block device\n",
> + q->limits.zoned == BLK_ZONED_HM ? "managed" : "aware");
> +
> out:
> kfree(buffer);
> }
> @@ -2835,14 +2921,14 @@ static int sd_revalidate_disk(struct gendisk *disk)
> * react badly if we do.
> */
> if (sdkp->media_present) {
> - sd_read_capacity(sdkp, buffer);
> -
> if (scsi_device_supports_vpd(sdp)) {
> sd_read_block_provisioning(sdkp);
> sd_read_block_limits(sdkp);
> sd_read_block_characteristics(sdkp);
> }
>
> + sd_read_capacity(sdkp, buffer);
> +
> sd_read_write_protect_flag(sdkp, buffer);
> sd_read_cache_type(sdkp, buffer);
> sd_read_app_tag_own(sdkp, buffer);
> @@ -3040,9 +3126,16 @@ static int sd_probe(struct device *dev)
>
> scsi_autopm_get_device(sdp);
> error = -ENODEV;
> - if (sdp->type != TYPE_DISK && sdp->type != TYPE_MOD && sdp->type != TYPE_RBC)
> + if (sdp->type != TYPE_DISK &&
> + sdp->type != TYPE_ZBC &&
> + sdp->type != TYPE_MOD &&
> + sdp->type != TYPE_RBC)
> goto out;
>
> +#ifndef CONFIG_BLK_DEV_ZONED
> + if (sdp->type == TYPE_ZBC)
> + goto out;
> +#endif
> SCSI_LOG_HLQUEUE(3, sdev_printk(KERN_INFO, sdp,
> "sd_probe\n"));
>
> @@ -3146,6 +3239,8 @@ static int sd_remove(struct device *dev)
> del_gendisk(sdkp->disk);
> sd_shutdown(dev);
>
> + sd_zbc_remove(sdkp);
> +
> blk_register_region(devt, SD_MINORS, NULL,
> sd_default_probe, NULL, NULL);
>
> diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
> index 765a6f1..3452871 100644
> --- a/drivers/scsi/sd.h
> +++ b/drivers/scsi/sd.h
> @@ -56,6 +56,7 @@ enum {
> SD_LBP_WS16, /* Use WRITE SAME(16) with UNMAP bit */
> SD_LBP_WS10, /* Use WRITE SAME(10) with UNMAP bit */
> SD_LBP_ZERO, /* Use WRITE SAME(10) with zero payload */
> + SD_ZBC_RESET_WP, /* Use RESET WRITE POINTER */
> SD_LBP_DISABLE, /* Discard disabled due to failed cmd */
> };
>
Can we have the addition of SD_ZBC_RESET_WP as a separate patch?
> @@ -64,6 +65,11 @@ struct scsi_disk {
> struct scsi_device *device;
> struct device dev;
> struct gendisk *disk;
> +#ifdef CONFIG_BLK_DEV_ZONED
> + struct workqueue_struct *zone_work_q;
> + sector_t zone_sectors;
> + unsigned int nr_zones;
> +#endif
> atomic_t openers;
> sector_t capacity; /* size in logical blocks */
> u32 max_xfer_blocks;
> @@ -94,6 +100,8 @@ struct scsi_disk {
> unsigned lbpvpd : 1;
> unsigned ws10 : 1;
> unsigned ws16 : 1;
> + unsigned rc_basis: 2;
> + unsigned zoned: 2;
> };
> #define to_scsi_disk(obj) container_of(obj,struct scsi_disk,dev)
>
> @@ -156,6 +164,13 @@ static inline unsigned int logical_to_bytes(struct scsi_device *sdev, sector_t b
> return blocks * sdev->sector_size;
> }
>
> +static inline sector_t sectors_to_logical(struct scsi_device *sdev, sector_t sector)
> +{
> + return sector >> (ilog2(sdev->sector_size) - 9);
> +}
> +
> +extern void sd_config_discard(struct scsi_disk *, unsigned int);
> +
> /*
> * A DIF-capable target device can be formatted with different
> * protection schemes. Currently 0 through 3 are defined:
> @@ -269,4 +284,57 @@ static inline void sd_dif_complete(struct scsi_cmnd *cmd, unsigned int a)
>
> #endif /* CONFIG_BLK_DEV_INTEGRITY */
>
> +#ifdef CONFIG_BLK_DEV_ZONED
> +
> +extern void sd_zbc_read_zones(struct scsi_disk *, char *);
> +extern void sd_zbc_remove(struct scsi_disk *);
> +extern int sd_zbc_setup_read_write(struct scsi_disk *, struct request *,
> + sector_t, unsigned int *);
> +extern int sd_zbc_setup_report_cmnd(struct scsi_cmnd *);
> +extern int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *);
> +extern int sd_zbc_setup_open_cmnd(struct scsi_cmnd *);
> +extern int sd_zbc_setup_close_cmnd(struct scsi_cmnd *);
> +extern int sd_zbc_setup_finish_cmnd(struct scsi_cmnd *);
> +extern void sd_zbc_done(struct scsi_cmnd *, struct scsi_sense_hdr *);
> +
> +#else /* CONFIG_BLK_DEV_ZONED */
> +
> +static inline void sd_zbc_read_zones(struct scsi_disk *sdkp,
> + unsigned char *buf) {}
> +static inline void sd_zbc_remove(struct scsi_disk *sdkp) {}
> +
> +static inline int sd_zbc_setup_read_write(struct scsi_disk *sdkp,
> + struct request *rq, sector_t sector,
> + unsigned int *num_sectors)
> +{
> + /* Let the drive fail requests */
> + return BLKPREP_OK;
> +}
> +
> +static inline int sd_zbc_setup_report_cmnd(struct scsi_cmnd *cmd)
> +{
> + return BLKPREP_KILL;
> +}
> +static inline int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *cmd)
> +{
> + return BLKPREP_KILL;
> +}
> +static inline int sd_zbc_setup_open_cmnd(struct scsi_cmnd *cmd)
> +{
> + return BLKPREP_KILL;
> +}
> +static inline int sd_zbc_setup_close_cmnd(struct scsi_cmnd *cmd)
> +{
> + return BLKPREP_KILL;
> +}
> +static inline int sd_zbc_setup_finish_cmnd(struct scsi_cmnd *cmd)
> +{
> + return BLKPREP_KILL;
> +}
> +
> +static inline void sd_zbc_done(struct scsi_cmnd *cmd,
> + struct scsi_sense_hdr *sshdr) {}
> +
> +#endif /* CONFIG_BLK_DEV_ZONED */
> +
> #endif /* _SCSI_DISK_H */
> diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
> new file mode 100644
> index 0000000..ec9c3fc
> --- /dev/null
> +++ b/drivers/scsi/sd_zbc.c
> @@ -0,0 +1,1097 @@
> +/*
> + * SCSI Zoned Block commands
> + *
> + * Copyright (C) 2014-2015 SUSE Linux GmbH
> + * Written by: Hannes Reinecke <hare@suse.de>
> + * Modified by: Damien Le Moal <damien.lemoal@hgst.com>
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License version
> + * 2 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + * General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; see the file COPYING. If not, write to
> + * the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139,
> + * USA.
> + *
> + */
> +
> +#include <linux/blkdev.h>
> +#include <linux/rbtree.h>
> +
> +#include <asm/unaligned.h>
> +
> +#include <scsi/scsi.h>
> +#include <scsi/scsi_cmnd.h>
> +#include <scsi/scsi_dbg.h>
> +#include <scsi/scsi_device.h>
> +#include <scsi/scsi_driver.h>
> +#include <scsi/scsi_host.h>
> +#include <scsi/scsi_eh.h>
> +
> +#include "sd.h"
> +#include "scsi_priv.h"
> +
> +enum zbc_zone_type {
> + ZBC_ZONE_TYPE_CONV = 0x1,
> + ZBC_ZONE_TYPE_SEQWRITE_REQ,
> + ZBC_ZONE_TYPE_SEQWRITE_PREF,
> + ZBC_ZONE_TYPE_RESERVED,
> +};
> +
> +enum zbc_zone_cond {
> + ZBC_ZONE_COND_NO_WP,
> + ZBC_ZONE_COND_EMPTY,
> + ZBC_ZONE_COND_IMP_OPEN,
> + ZBC_ZONE_COND_EXP_OPEN,
> + ZBC_ZONE_COND_CLOSED,
> + ZBC_ZONE_COND_READONLY = 0xd,
> + ZBC_ZONE_COND_FULL,
> + ZBC_ZONE_COND_OFFLINE,
> +};
> +
> +#define SD_ZBC_BUF_SIZE 131072
> +
> +#define sd_zbc_debug(sdkp, fmt, args...) \
> + pr_debug("%s %s [%s]: " fmt, \
> + dev_driver_string(&(sdkp)->device->sdev_gendev), \
> + dev_name(&(sdkp)->device->sdev_gendev), \
> + (sdkp)->disk->disk_name, ## args)
> +
> +#define sd_zbc_debug_ratelimit(sdkp, fmt, args...) \
> + do { \
> + if (printk_ratelimit()) \
> + sd_zbc_debug(sdkp, fmt, ## args); \
> + } while( 0 )
> +
> +#define sd_zbc_err(sdkp, fmt, args...) \
> + pr_err("%s %s [%s]: " fmt, \
> + dev_driver_string(&(sdkp)->device->sdev_gendev), \
> + dev_name(&(sdkp)->device->sdev_gendev), \
> + (sdkp)->disk->disk_name, ## args)
> +
> +struct zbc_zone_work {
> + struct work_struct zone_work;
> + struct scsi_disk *sdkp;
> + sector_t sector;
> + sector_t nr_sects;
> + bool init;
> + unsigned int nr_zones;
> +};
> +
> +struct blk_zone *zbc_desc_to_zone(struct scsi_disk *sdkp, unsigned char *rec)
> +{
> + struct blk_zone *zone;
> +
> + zone = kzalloc(sizeof(struct blk_zone), GFP_KERNEL);
> + if (!zone)
> + return NULL;
> +
> + /* Zone type */
> + switch(rec[0] & 0x0f) {
> + case ZBC_ZONE_TYPE_CONV:
> + case ZBC_ZONE_TYPE_SEQWRITE_REQ:
> + case ZBC_ZONE_TYPE_SEQWRITE_PREF:
> + zone->type = rec[0] & 0x0f;
> + break;
> + default:
> + zone->type = BLK_ZONE_TYPE_UNKNOWN;
> + break;
> + }
> +
> + /* Zone condition */
> + zone->cond = (rec[1] >> 4) & 0xf;
> + if (rec[1] & 0x01)
> + zone->reset = 1;
> + if (rec[1] & 0x02)
> + zone->non_seq = 1;
> +
> + /* Zone start sector and length */
> + zone->len = logical_to_sectors(sdkp->device,
> + get_unaligned_be64(&rec[8]));
> + zone->start = logical_to_sectors(sdkp->device,
> + get_unaligned_be64(&rec[16]));
> +
> + /* Zone write pointer */
> + if (blk_zone_is_empty(zone) &&
> + zone->wp != zone->start)
> + zone->wp = zone->start;
> + else if (blk_zone_is_full(zone))
> + zone->wp = zone->start + zone->len;
> + else if (blk_zone_is_seq(zone))
> + zone->wp = logical_to_sectors(sdkp->device,
> + get_unaligned_be64(&rec[24]));
> + else
> + zone->wp = (sector_t)-1;
> +
> + return zone;
> +}
> +
> +static int zbc_parse_zones(struct scsi_disk *sdkp, unsigned char *buf,
> + unsigned int buf_len, sector_t *next_sector)
> +{
> + struct request_queue *q = sdkp->disk->queue;
> + sector_t capacity = logical_to_sectors(sdkp->device, sdkp->capacity);
> + unsigned char *rec = buf;
> + unsigned int zone_len, list_length;
> +
> + /* Parse REPORT ZONES header */
> + list_length = get_unaligned_be32(&buf[0]);
> + rec = buf + 64;
> + list_length += 64;
> +
> + if (list_length < buf_len)
> + buf_len = list_length;
> +
> + /* Parse REPORT ZONES zone descriptors */
> + *next_sector = capacity;
> + while (rec < buf + buf_len) {
> +
> + struct blk_zone *new, *old;
> +
> + new = zbc_desc_to_zone(sdkp, rec);
> + if (!new)
> + return -ENOMEM;
> +
> + zone_len = new->len;
> + *next_sector = new->start + zone_len;
> +
> + old = blk_insert_zone(q, new);
> + if (old) {
> + blk_lock_zone(old);
> +
> + /*
> + * Always update the zone state flags and the zone
> + * offline and read-only condition as the drive may
> + * change those independently of the commands being
> + * executed
> + */
> + old->reset = new->reset;
> + old->non_seq = new->non_seq;
> + if (blk_zone_is_offline(new) ||
> + blk_zone_is_readonly(new))
> + old->cond = new->cond;
> +
> + if (blk_zone_in_update(old)) {
> + old->cond = new->cond;
> + old->wp = new->wp;
> + blk_clear_zone_update(old);
> + }
> +
> + blk_unlock_zone(old);
> +
> + kfree(new);
> + }
> +
> + rec += 64;
> +
> + }
> +
> + return 0;
> +}
> +
> +/**
> + * sd_zbc_report_zones - Issue a REPORT ZONES scsi command
> + * @sdkp: SCSI disk to which the command should be send
> + * @buffer: response buffer
> + * @bufflen: length of @buffer
> + * @start_sector: logical sector for the zone information should be reported
> + * @option: reporting option to be used
> + * @partial: flag to set the 'partial' bit for report zones command
> + */
> +int sd_zbc_report_zones(struct scsi_disk *sdkp, unsigned char *buffer,
> + int bufflen, sector_t start_sector,
> + enum zbc_zone_reporting_options option, bool partial)
> +{
> + struct scsi_device *sdp = sdkp->device;
> + const int timeout = sdp->request_queue->rq_timeout;
> + struct scsi_sense_hdr sshdr;
> + sector_t start_lba = sectors_to_logical(sdkp->device, start_sector);
> + unsigned char cmd[16];
> + int result;
> +
> + if (!scsi_device_online(sdp))
> + return -ENODEV;
> +
> + sd_zbc_debug(sdkp, "REPORT ZONES lba %zu len %d\n",
> + start_lba, bufflen);
> +
> + memset(cmd, 0, 16);
> + cmd[0] = ZBC_IN;
> + cmd[1] = ZI_REPORT_ZONES;
> + put_unaligned_be64(start_lba, &cmd[2]);
> + put_unaligned_be32(bufflen, &cmd[10]);
> + cmd[14] = (partial ? ZBC_REPORT_ZONE_PARTIAL : 0) | option;
> + memset(buffer, 0, bufflen);
> +
> + result = scsi_execute_req(sdp, cmd, DMA_FROM_DEVICE,
> + buffer, bufflen, &sshdr,
> + timeout, SD_MAX_RETRIES, NULL);
> +
> + if (result) {
> + sd_zbc_err(sdkp,
> + "REPORT ZONES lba %zu failed with %d/%d\n",
> + start_lba, host_byte(result), driver_byte(result));
> + return -EIO;
> + }
> +
> + return 0;
> +}
> +
> +/**
> + * Set or clear the update flag of all zones contained
> + * in the range sector..sector+nr_sects.
> + * Return the number of zones marked/cleared.
> + */
> +static int __sd_zbc_zones_updating(struct scsi_disk *sdkp,
> + sector_t sector, sector_t nr_sects,
> + bool set)
> +{
> + struct request_queue *q = sdkp->disk->queue;
> + struct blk_zone *zone;
> + struct rb_node *node;
> + unsigned long flags;
> + int nr_zones = 0;
> +
> + if (!nr_sects) {
> + /* All zones */
> + sector = 0;
> + nr_sects = logical_to_sectors(sdkp->device, sdkp->capacity);
> + }
> +
> + spin_lock_irqsave(&q->zones_lock, flags);
> + for (node = rb_first(&q->zones); node && nr_sects; node = rb_next(node)) {
> + zone = rb_entry(node, struct blk_zone, node);
> + if (sector < zone->start || sector >= (zone->start + zone->len))
> + continue;
> + if (set) {
> + if (!test_and_set_bit_lock(BLK_ZONE_IN_UPDATE, &zone->flags))
> + nr_zones++;
> + } else if (test_and_clear_bit(BLK_ZONE_IN_UPDATE, &zone->flags)) {
> + wake_up_bit(&zone->flags, BLK_ZONE_IN_UPDATE);
> + nr_zones++;
> + }
> + sector = zone->start + zone->len;
> + if (nr_sects <= zone->len)
> + nr_sects = 0;
> + else
> + nr_sects -= zone->len;
> + }
> + spin_unlock_irqrestore(&q->zones_lock, flags);
> +
> + return nr_zones;
> +}
> +
> +static inline int sd_zbc_set_zones_updating(struct scsi_disk *sdkp,
> + sector_t sector, sector_t nr_sects)
> +{
> + return __sd_zbc_zones_updating(sdkp, sector, nr_sects, true);
> +}
> +
> +static inline int sd_zbc_clear_zones_updating(struct scsi_disk *sdkp,
> + sector_t sector, sector_t nr_sects)
> +{
> + return __sd_zbc_zones_updating(sdkp, sector, nr_sects, false);
> +}
> +
> +static void sd_zbc_start_queue(struct request_queue *q)
> +{
> + unsigned long flags;
> +
> + if (q->mq_ops) {
> + blk_mq_start_hw_queues(q);
> + } else {
> + spin_lock_irqsave(q->queue_lock, flags);
> + blk_start_queue(q);
> + spin_unlock_irqrestore(q->queue_lock, flags);
> + }
> +}
> +
> +static void sd_zbc_update_zone_work(struct work_struct *work)
> +{
> + struct zbc_zone_work *zwork =
> + container_of(work, struct zbc_zone_work, zone_work);
> + struct scsi_disk *sdkp = zwork->sdkp;
> + sector_t capacity = logical_to_sectors(sdkp->device, sdkp->capacity);
> + struct request_queue *q = sdkp->disk->queue;
> + sector_t end_sector, sector = zwork->sector;
> + unsigned int bufsize;
> + unsigned char *buf;
> + int ret = -ENOMEM;
> +
> + /* Get a buffer */
> + if (!zwork->nr_zones) {
> + bufsize = SD_ZBC_BUF_SIZE;
> + } else {
> + bufsize = (zwork->nr_zones + 1) * 64;
> + if (bufsize < 512)
> + bufsize = 512;
> + else if (bufsize > SD_ZBC_BUF_SIZE)
> + bufsize = SD_ZBC_BUF_SIZE;
> + else
> + bufsize = (bufsize + 511) & ~511;
> + }
> + buf = kmalloc(bufsize, GFP_KERNEL | GFP_DMA);
> + if (!buf) {
> + sd_zbc_err(sdkp, "Failed to allocate zone report buffer\n");
> + goto done_free;
> + }
> +
> + /* Process sector range */
> + end_sector = zwork->sector + zwork->nr_sects;
> + while(sector < min(end_sector, capacity)) {
> +
> + /* Get zone report */
> + ret = sd_zbc_report_zones(sdkp, buf, bufsize, sector,
> + ZBC_ZONE_REPORTING_OPTION_ALL, true);
> + if (ret)
> + break;
> +
> + ret = zbc_parse_zones(sdkp, buf, bufsize, &sector);
> + if (ret)
> + break;
> +
> + /* Kick start the queue to allow requests waiting */
> + /* for the zones just updated to run */
> + sd_zbc_start_queue(q);
> +
> + }
> +
> +done_free:
> + if (ret)
> + sd_zbc_clear_zones_updating(sdkp, zwork->sector, zwork->nr_sects);
> + if (buf)
> + kfree(buf);
> + kfree(zwork);
> +}
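The buffer sizing above is easy to check in isolation. Below is a minimal userspace sketch, assuming a 64-byte report header plus one 64-byte descriptor per zone as the patch does. The name report_bufsize and the max_size parameter (standing in for SD_ZBC_BUF_SIZE, whose value is not visible in this hunk) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace model of the buffer sizing in sd_zbc_update_zone_work().
 * max_size stands in for SD_ZBC_BUF_SIZE (an assumption of this sketch).
 */
static size_t report_bufsize(unsigned int nr_zones, size_t max_size)
{
	size_t bufsize;

	assert(max_size >= 512);

	/* Zone count unknown: fall back to the largest buffer. */
	if (!nr_zones)
		return max_size;

	/* 64-byte report header plus one 64-byte descriptor per zone. */
	bufsize = ((size_t)nr_zones + 1) * 64;
	if (bufsize < 512)
		return 512;
	if (bufsize > max_size)
		return max_size;

	/* Round up to a multiple of 512 bytes. */
	return (bufsize + 511) & ~(size_t)511;
}
```

As in the patch, the round-up step can only exceed max_size if max_size itself is not a multiple of 512, so the sketch assumes the caller passes a 512-aligned maximum.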
> +
> +/**
> + * sd_zbc_update_zones - Update zone information for the zones in the
> + * sector range starting at @sector. If not in init mode, the update is
> + * done only for zones marked with the update flag.
> + * @sdkp: SCSI disk for which the zone information needs to be updated
> + * @sector: First sector of the first zone to be updated
> + * @nr_sects: Number of sectors to cover (0 means all zones)
> + * @gfpflags: Allocation flags for the zone work item
> + * @init: true when called for the initial zone scan
> + */
> +static int sd_zbc_update_zones(struct scsi_disk *sdkp,
> + sector_t sector, sector_t nr_sects,
> + gfp_t gfpflags, bool init)
> +{
> + struct zbc_zone_work *zwork;
> +
> + zwork = kzalloc(sizeof(struct zbc_zone_work), gfpflags);
> + if (!zwork) {
> + sd_zbc_err(sdkp, "Failed to allocate zone work\n");
> + return -ENOMEM;
> + }
> +
> + if (!nr_sects) {
> + /* All zones */
> + sector = 0;
> + nr_sects = logical_to_sectors(sdkp->device, sdkp->capacity);
> + }
> +
> + INIT_WORK(&zwork->zone_work, sd_zbc_update_zone_work);
> + zwork->sdkp = sdkp;
> + zwork->sector = sector;
> + zwork->nr_sects = nr_sects;
> + zwork->init = init;
> +
> + if (!init)
> + /* Mark the zones falling in the report as updating */
> + zwork->nr_zones = sd_zbc_set_zones_updating(sdkp, sector, nr_sects);
> +
> + if (init || zwork->nr_zones)
> + queue_work(sdkp->zone_work_q, &zwork->zone_work);
> + else
> + kfree(zwork);
> +
> + return 0;
> +}
> +
> +int sd_zbc_setup_report_cmnd(struct scsi_cmnd *cmd)
> +{
> + struct request *rq = cmd->request;
> + struct gendisk *disk = rq->rq_disk;
> + struct scsi_disk *sdkp = scsi_disk(disk);
> + int ret;
> +
> + if (!sdkp->zone_work_q)
> + return BLKPREP_KILL;
> +
> + ret = sd_zbc_update_zones(sdkp, blk_rq_pos(rq), blk_rq_sectors(rq),
> + GFP_ATOMIC, false);
> + if (unlikely(ret))
> + return BLKPREP_DEFER;
> +
> + return BLKPREP_DONE;
> +}
> +
> +static void sd_zbc_setup_action_cmnd(struct scsi_cmnd *cmd,
> + u8 action,
> + bool all)
> +{
> + struct request *rq = cmd->request;
> + struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
> + sector_t lba;
> +
> + cmd->cmd_len = 16;
> + cmd->cmnd[0] = ZBC_OUT;
> + cmd->cmnd[1] = action;
> + if (all) {
> + cmd->cmnd[14] |= 0x01;
> + } else {
> + lba = sectors_to_logical(sdkp->device, blk_rq_pos(rq));
> + put_unaligned_be64(lba, &cmd->cmnd[2]);
> + }
> +
> + rq->completion_data = NULL;
> + rq->timeout = SD_TIMEOUT;
> + rq->__data_len = blk_rq_bytes(rq);
> +
> + /* Don't retry */
> + cmd->allowed = 0;
> + cmd->transfersize = 0;
> + cmd->sc_data_direction = DMA_NONE;
> +}
> +
> +int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *cmd)
> +{
> + struct request *rq = cmd->request;
> + struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
> + sector_t sector = blk_rq_pos(rq);
> + sector_t nr_sects = blk_rq_sectors(rq);
> + struct blk_zone *zone = NULL;
> + int ret = BLKPREP_OK;
> +
> + if (nr_sects) {
> + zone = blk_lookup_zone(rq->q, sector);
> + if (!zone)
> + return BLKPREP_KILL;
> + }
> +
> + if (zone) {
> +
> + blk_lock_zone(zone);
> +
> + /* If the zone is being updated, wait */
> + if (blk_zone_in_update(zone)) {
> + ret = BLKPREP_DEFER;
> + goto out;
> + }
> +
> + if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
> + sd_zbc_debug(sdkp,
> + "Discarding unknown zone %zu\n",
> + zone->start);
> + ret = BLKPREP_KILL;
> + goto out;
> + }
> +
> + /* Nothing to do for conventional zones */
> + if (blk_zone_is_conv(zone)) {
> + ret = BLKPREP_DONE;
> + goto out;
> + }
> +
> + if (!blk_try_write_lock_zone(zone)) {
> + ret = BLKPREP_DEFER;
> + goto out;
> + }
> +
> + /* Nothing to do if the zone is already empty */
> + if (blk_zone_is_empty(zone)) {
> + blk_write_unlock_zone(zone);
> + ret = BLKPREP_DONE;
> + goto out;
> + }
> +
> + if (sector != zone->start ||
> + (nr_sects != zone->len)) {
> + sd_printk(KERN_ERR, sdkp,
> + "Unaligned reset wp request, start %zu/%zu"
> + " len %zu/%zu\n",
> + zone->start, sector, zone->len, nr_sects);
> + blk_write_unlock_zone(zone);
> + ret = BLKPREP_KILL;
> + goto out;
> + }
> +
> + }
> +
> + sd_zbc_setup_action_cmnd(cmd, ZO_RESET_WRITE_POINTER, !zone);
> +
> +out:
> + if (zone) {
> + if (ret == BLKPREP_OK) {
> + /*
> + * Opportunistic update. Will be fixed up
> + * with zone update if the command fails.
> + */
> + zone->wp = zone->start;
> + zone->cond = BLK_ZONE_COND_EMPTY;
> + zone->reset = 0;
> + zone->non_seq = 0;
> + }
> + blk_unlock_zone(zone);
> + }
> +
> + return ret;
> +}
> +
> +int sd_zbc_setup_open_cmnd(struct scsi_cmnd *cmd)
> +{
> + struct request *rq = cmd->request;
> + struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
> + sector_t sector = blk_rq_pos(rq);
> + sector_t nr_sects = blk_rq_sectors(rq);
> + struct blk_zone *zone = NULL;
> + int ret = BLKPREP_OK;
> +
> + if (nr_sects) {
> + zone = blk_lookup_zone(rq->q, sector);
> + if (!zone)
> + return BLKPREP_KILL;
> + }
> +
> + if (zone) {
> +
> + blk_lock_zone(zone);
> +
> + /* If the zone is being updated, wait */
> + if (blk_zone_in_update(zone)) {
> + ret = BLKPREP_DEFER;
> + goto out;
> + }
> +
> + if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
> + sd_zbc_debug(sdkp,
> + "Opening unknown zone %zu\n",
> + zone->start);
> + ret = BLKPREP_KILL;
> + goto out;
> + }
> +
> + /*
> + * Nothing to do for conventional zones,
> + * zones already open or full zones.
> + */
> + if (blk_zone_is_conv(zone) ||
> + blk_zone_is_open(zone) ||
> + blk_zone_is_full(zone)) {
> + ret = BLKPREP_DONE;
> + goto out;
> + }
> +
> + if (sector != zone->start ||
> + (nr_sects != zone->len)) {
> + sd_printk(KERN_ERR, sdkp,
> + "Unaligned open zone request, start %zu/%zu"
> + " len %zu/%zu\n",
> + zone->start, sector, zone->len, nr_sects);
> + ret = BLKPREP_KILL;
> + goto out;
> + }
> +
> + }
> +
> + sd_zbc_setup_action_cmnd(cmd, ZO_OPEN_ZONE, !zone);
> +
> +out:
> + if (zone) {
> + if (ret == BLKPREP_OK)
> + /*
> + * Opportunistic update. Will be fixed up
> + * with zone update if the command fails.
> + */
> + zone->cond = BLK_ZONE_COND_EXP_OPEN;
> + blk_unlock_zone(zone);
> + }
> +
> + return ret;
> +}
> +
> +int sd_zbc_setup_close_cmnd(struct scsi_cmnd *cmd)
> +{
> + struct request *rq = cmd->request;
> + struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
> + sector_t sector = blk_rq_pos(rq);
> + sector_t nr_sects = blk_rq_sectors(rq);
> + struct blk_zone *zone = NULL;
> + int ret = BLKPREP_OK;
> +
> + if (nr_sects) {
> + zone = blk_lookup_zone(rq->q, sector);
> + if (!zone)
> + return BLKPREP_KILL;
> + }
> +
> + if (zone) {
> +
> + blk_lock_zone(zone);
> +
> + /* If the zone is being updated, wait */
> + if (blk_zone_in_update(zone)) {
> + ret = BLKPREP_DEFER;
> + goto out;
> + }
> +
> + if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
> + sd_zbc_debug(sdkp,
> + "Closing unknown zone %zu\n",
> + zone->start);
> + ret = BLKPREP_KILL;
> + goto out;
> + }
> +
> + /*
> + * Nothing to do for conventional zones,
> + * full zones or empty zones.
> + */
> + if (blk_zone_is_conv(zone) ||
> + blk_zone_is_full(zone) ||
> + blk_zone_is_empty(zone)) {
> + ret = BLKPREP_DONE;
> + goto out;
> + }
> +
> + if (sector != zone->start ||
> + (nr_sects != zone->len)) {
> + sd_printk(KERN_ERR, sdkp,
> + "Unaligned close zone request, start %zu/%zu"
> + " len %zu/%zu\n",
> + zone->start, sector, zone->len, nr_sects);
> + ret = BLKPREP_KILL;
> + goto out;
> + }
> +
> + }
> +
> + sd_zbc_setup_action_cmnd(cmd, ZO_CLOSE_ZONE, !zone);
> +
> +out:
> + if (zone) {
> + if (ret == BLKPREP_OK)
> + /*
> + * Opportunistic update. Will be fixed up
> + * with zone update if the command fails.
> + */
> + zone->cond = BLK_ZONE_COND_CLOSED;
> + blk_unlock_zone(zone);
> + }
> +
> + return ret;
> +}
> +
> +int sd_zbc_setup_finish_cmnd(struct scsi_cmnd *cmd)
> +{
> + struct request *rq = cmd->request;
> + struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
> + sector_t sector = blk_rq_pos(rq);
> + sector_t nr_sects = blk_rq_sectors(rq);
> + struct blk_zone *zone = NULL;
> + int ret = BLKPREP_OK;
> +
> + if (nr_sects) {
> + zone = blk_lookup_zone(rq->q, sector);
> + if (!zone)
> + return BLKPREP_KILL;
> + }
> +
> + if (zone) {
> +
> + blk_lock_zone(zone);
> +
> + /* If the zone is being updated, wait */
> + if (blk_zone_in_update(zone)) {
> + ret = BLKPREP_DEFER;
> + goto out;
> + }
> +
> + if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
> + sd_zbc_debug(sdkp,
> + "Finishing unknown zone %zu\n",
> + zone->start);
> + ret = BLKPREP_KILL;
> + goto out;
> + }
> +
> + /* Nothing to do for conventional zones and full zones */
> + if (blk_zone_is_conv(zone) ||
> + blk_zone_is_full(zone)) {
> + ret = BLKPREP_DONE;
> + goto out;
> + }
> +
> + if (sector != zone->start ||
> + (nr_sects != zone->len)) {
> + sd_printk(KERN_ERR, sdkp,
> + "Unaligned finish zone request, start %zu/%zu"
> + " len %zu/%zu\n",
> + zone->start, sector, zone->len, nr_sects);
> + ret = BLKPREP_KILL;
> + goto out;
> + }
> +
> + }
> +
> + sd_zbc_setup_action_cmnd(cmd, ZO_FINISH_ZONE, !zone);
> +
> +out:
> + if (zone) {
> + if (ret == BLKPREP_OK) {
> + /*
> + * Opportunistic update. Will be fixed up
> + * with zone update if the command fails.
> + */
> + zone->cond = BLK_ZONE_COND_FULL;
> + if (blk_zone_is_seq(zone))
> + zone->wp = zone->start + zone->len;
> + }
> + blk_unlock_zone(zone);
> + }
> +
> + return ret;
> +}
> +
Would be nice to have open/close/finish/reset share a little more code.
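To make the suggestion concrete: the four setup functions run the same checks and differ mainly in the "nothing to do" test and in the opportunistic cache update. Below is a minimal userspace sketch of a shared pre-check. All names (zone_model, zone_action_precheck, the PREP_* codes standing in for BLKPREP_*) are hypothetical and only model the logic of the patch, not its locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the kernel's BLKPREP_* codes (values are illustrative). */
enum prep { PREP_OK, PREP_KILL, PREP_DEFER, PREP_DONE };

enum zone_type { ZT_UNKNOWN, ZT_CONV, ZT_SEQ };
enum zone_cond { ZC_EMPTY, ZC_OPEN, ZC_CLOSED, ZC_FULL };

/* Simplified model of struct blk_zone: only what the checks need. */
struct zone_model {
	uint64_t start, len;
	enum zone_type type;
	enum zone_cond cond;
	bool updating;
};

/* Per-action predicate: true when the action would be a no-op. */
typedef bool (*zone_noop_fn)(const struct zone_model *z);

static bool open_is_noop(const struct zone_model *z)
{
	return z->type == ZT_CONV || z->cond == ZC_OPEN || z->cond == ZC_FULL;
}

static bool close_is_noop(const struct zone_model *z)
{
	return z->type == ZT_CONV || z->cond == ZC_FULL || z->cond == ZC_EMPTY;
}

static bool finish_is_noop(const struct zone_model *z)
{
	return z->type == ZT_CONV || z->cond == ZC_FULL;
}

/* Checks common to open/close/finish, in the same order as the patch. */
static enum prep zone_action_precheck(const struct zone_model *z,
				      uint64_t sector, uint64_t nr_sects,
				      zone_noop_fn is_noop)
{
	assert(z && is_noop);

	if (z->updating)
		return PREP_DEFER;	/* zone report in flight: retry later */
	if (z->type == ZT_UNKNOWN)
		return PREP_KILL;
	if (is_noop(z))
		return PREP_DONE;
	if (sector != z->start || nr_sects != z->len)
		return PREP_KILL;	/* request must cover exactly one zone */
	return PREP_OK;
}
```

Reset would need a little extra, since it takes the zone write lock between the conventional-zone check and the empty-zone check, but the rest of the flow looks the same.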
> +int sd_zbc_setup_read_write(struct scsi_disk *sdkp, struct request *rq,
> + sector_t sector, unsigned int *num_sectors)
> +{
> + struct blk_zone *zone;
> + unsigned int sectors = *num_sectors;
> + int ret = BLKPREP_OK;
> +
> + zone = blk_lookup_zone(rq->q, sector);
> + if (!zone)
> + /* Let the drive handle the request */
> + return BLKPREP_OK;
> +
> + blk_lock_zone(zone);
> +
> + /* If the zone is being updated, wait */
> + if (blk_zone_in_update(zone)) {
> + ret = BLKPREP_DEFER;
> + goto out;
> + }
> +
> + if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
> + sd_zbc_debug(sdkp,
> + "Unknown zone %zu\n",
> + zone->start);
> + ret = BLKPREP_KILL;
> + goto out;
> + }
> +
> + /* For offline and read-only zones, let the drive fail the command */
> + if (blk_zone_is_offline(zone) ||
> + blk_zone_is_readonly(zone))
> + goto out;
> +
> + /* Do not allow zone boundaries crossing */
> + if (sector + sectors > zone->start + zone->len) {
> + ret = BLKPREP_KILL;
> + goto out;
> + }
> +
> + /* For conventional zones, no checks */
> + if (blk_zone_is_conv(zone))
> + goto out;
> +
> + if (req_op(rq) == REQ_OP_WRITE ||
> + req_op(rq) == REQ_OP_WRITE_SAME) {
> +
> + /*
> + * Write requests may change the write pointer and
> + * transition the zone condition to full. Changes
> + * are opportunistic here. If the request fails, a
> + * zone update will fix the zone information.
> + */
> + if (blk_zone_is_seq_req(zone)) {
> +
> + /*
> + * Do not issue more than one write at a time per
> + * zone. This solves write ordering problems due to
> + * the unlocking of the request queue in the dispatch
> + * path in the non scsi-mq case. For scsi-mq, this
> + * also avoids potential write reordering when multiple
> + * threads running on different CPUs write to the same
> + * zone (with a synchronized sequential pattern).
> + */
> + if (!blk_try_write_lock_zone(zone)) {
> + ret = BLKPREP_DEFER;
> + goto out;
> + }
> +
> + /* For host-managed drives, writes are allowed */
> + /* only at the write pointer position. */
> + if (zone->wp != sector) {
> + blk_write_unlock_zone(zone);
> + ret = BLKPREP_KILL;
> + goto out;
> + }
> +
> + zone->wp += sectors;
> + if (zone->wp >= zone->start + zone->len) {
> + zone->cond = BLK_ZONE_COND_FULL;
> + zone->wp = zone->start + zone->len;
> + }
> +
> + } else {
> +
> + /* For host-aware drives, writes are allowed */
> + /* anywhere in the zone, but wp can only go */
> + /* forward. */
> + sector_t end_sector = sector + sectors;
> + if (sector == zone->wp &&
> + end_sector >= zone->start + zone->len) {
> + zone->cond = BLK_ZONE_COND_FULL;
> + zone->wp = zone->start + zone->len;
> + } else if (end_sector > zone->wp) {
> + zone->wp = end_sector;
> + }
> +
> + }
> +
> + } else {
> +
If the drive does not have restricted reads,
then just goto out here.
Not all HM drives will have restricted reads, and
no HA drives have restricted reads.
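Roughly what I have in mind, as a minimal userspace sketch. The unrestricted_reads flag is a hypothetical per-disk field (if I recall the spec correctly, ZBC reports this as the URSWRZ bit in the Zoned Block Device Characteristics VPD page), and zbc_read_check with its return codes is an illustrative name, not something in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum read_action {
	READ_PASS,	/* send the request to the drive unchanged */
	READ_ZERO_FILL,	/* entirely beyond the WP: complete with zeroes */
	READ_TRUNCATE,	/* straddles the WP: shorten the request */
};

/*
 * Model of the read path for a write pointer zone.
 * unrestricted_reads is a hypothetical flag: when set, the drive
 * accepts reads beyond the write pointer and no fixup is needed.
 */
static enum read_action zbc_read_check(bool unrestricted_reads, uint64_t wp,
				       uint64_t sector, uint32_t *nr_sects)
{
	assert(nr_sects);

	if (unrestricted_reads)
		return READ_PASS;

	/* Read ends at or before the write pointer */
	if (sector + *nr_sects <= wp)
		return READ_PASS;

	/* Read starts at or beyond the write pointer */
	if (wp <= sector)
		return READ_ZERO_FILL;

	/* Read straddles the write pointer: keep the written part only */
	*nr_sects = (uint32_t)(wp - sector);
	return READ_TRUNCATE;
}
```

With the flag set, the zero-fill and truncation paths are skipped entirely, which is the behaviour I would expect for HA drives and for HM drives without restricted reads.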
> + /* Check read after write pointer */
> + if (sector + sectors <= zone->wp)
> + goto out;
> +
> + if (zone->wp <= sector) {
> + /* Read beyond WP: clear request buffer */
> + struct req_iterator iter;
> + struct bio_vec bvec;
> + unsigned long flags;
> + void *buf;
> + rq_for_each_segment(bvec, rq, iter) {
> + buf = bvec_kmap_irq(&bvec, &flags);
> + memset(buf, 0, bvec.bv_len);
> + flush_dcache_page(bvec.bv_page);
> + bvec_kunmap_irq(buf, &flags);
> + }
> + ret = BLKPREP_DONE;
> + goto out;
> + }
> +
> + /* Read straddle WP position: limit request size */
> + *num_sectors = zone->wp - sector;
> +
> + }
> +
> +out:
> + blk_unlock_zone(zone);
> +
> + return ret;
> +}
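For reference while reviewing the write path, here is how I read the write pointer bookkeeping, as a minimal userspace sketch. The wp_zone struct and wp_write are hypothetical names, the zone write lock is left out, and seq_required distinguishes the host-managed case (sequential write required) from the host-aware case (sequential write preferred).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified zone state: only what the write path touches. */
struct wp_zone {
	uint64_t start, len, wp;
	bool full;
};

/*
 * Model of the opportunistic write pointer update.
 * Returns false when the request would be killed.
 */
static bool wp_write(struct wp_zone *z, bool seq_required,
		     uint64_t sector, uint32_t nr_sects)
{
	uint64_t zone_end = z->start + z->len;
	uint64_t end = sector + nr_sects;

	/* Requests must not cross the zone boundary. */
	if (end > zone_end)
		return false;

	if (seq_required) {
		/* Host-managed: writes only at the write pointer. */
		if (sector != z->wp)
			return false;
		z->wp = end;
		if (z->wp >= zone_end) {
			z->full = true;
			z->wp = zone_end;
		}
	} else if (sector == z->wp && end >= zone_end) {
		/* Host-aware: sequential write reaching the zone end. */
		z->full = true;
		z->wp = zone_end;
	} else if (end > z->wp) {
		/* Host-aware: the write pointer only moves forward. */
		z->wp = end;
	}

	return true;
}
```

A host-aware write below the write pointer leaves the cached state untouched, which matches the "wp can only go forward" comment in the patch.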
> +
> +void sd_zbc_done(struct scsi_cmnd *cmd,
> + struct scsi_sense_hdr *sshdr)
> +{
> + int result = cmd->result;
> + struct request *rq = cmd->request;
> + struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
> + struct request_queue *q = sdkp->disk->queue;
> + sector_t pos = blk_rq_pos(rq);
> + struct blk_zone *zone = NULL;
> + bool write_unlock = false;
> +
> + /*
> + * Get the target zone of commands of interest. Some may
> + * apply to all zones so check the request sectors first.
> + */
> + switch (req_op(rq)) {
> + case REQ_OP_DISCARD:
> + case REQ_OP_WRITE:
> + case REQ_OP_WRITE_SAME:
> + case REQ_OP_ZONE_RESET:
> + write_unlock = true;
> + /* fallthru */
> + case REQ_OP_ZONE_OPEN:
> + case REQ_OP_ZONE_CLOSE:
> + case REQ_OP_ZONE_FINISH:
> + if (blk_rq_sectors(rq))
> + zone = blk_lookup_zone(q, pos);
> + break;
> + }
> +
> + if (zone && write_unlock)
> + blk_write_unlock_zone(zone);
> +
> + if (!result)
> + return;
> +
> + if (sshdr->sense_key == ILLEGAL_REQUEST &&
> + sshdr->asc == 0x21)
> + /*
> + * It is unlikely that retrying requests failed with any
> + * kind of alignment error will result in success. So don't
> + * try. Report the error back to the user quickly so that
> + * corrective actions can be taken after obtaining updated
> + * zone information.
> + */
> + cmd->allowed = 0;
> +
> + /* On error, force an update unless this is a failed report */
> + if (req_op(rq) == REQ_OP_ZONE_REPORT)
> + sd_zbc_clear_zones_updating(sdkp, pos, blk_rq_sectors(rq));
> + else if (zone)
> + sd_zbc_update_zones(sdkp, zone->start, zone->len,
> + GFP_ATOMIC, false);
> +}
> +
> +void sd_zbc_read_zones(struct scsi_disk *sdkp, char *buf)
> +{
> + struct request_queue *q = sdkp->disk->queue;
> + struct blk_zone *zone;
> + sector_t capacity;
> + sector_t sector;
> + bool init = false;
> + u32 rep_len;
> + int ret = 0;
> +
> + if (sdkp->zoned != 1 && sdkp->device->type != TYPE_ZBC)
> + /*
> + * Device managed or normal SCSI disk,
> + * no special handling required
> + */
> + return;
> +
> + /* Do a report zone to get the maximum LBA to check capacity */
> + ret = sd_zbc_report_zones(sdkp, buf, SD_BUF_SIZE,
> + 0, ZBC_ZONE_REPORTING_OPTION_ALL, false);
> + if (ret < 0)
> + return;
> +
> + rep_len = get_unaligned_be32(&buf[0]);
> + if (rep_len < 64) {
> + sd_printk(KERN_WARNING, sdkp,
> + "REPORT ZONES report invalid length %u\n",
> + rep_len);
> + return;
> + }
> +
> + if (sdkp->rc_basis == 0) {
> + /* The max_lba field is the capacity of this device */
> + sector_t lba = get_unaligned_be64(&buf[8]);
> + if (lba + 1 > sdkp->capacity) {
> + if (sdkp->first_scan)
> + sd_printk(KERN_WARNING, sdkp,
> + "Changing capacity from %zu "
> + "to max LBA+1 %zu\n",
> + sdkp->capacity,
> + (sector_t) lba + 1);
> + sdkp->capacity = lba + 1;
> + }
> + }
> +
> + /* Setup the zone work queue */
> + if (! sdkp->zone_work_q) {
> + sdkp->zone_work_q =
> + alloc_ordered_workqueue("zbc_wq_%s", WQ_MEM_RECLAIM,
> + sdkp->disk->disk_name);
> + if (!sdkp->zone_work_q) {
> + sdev_printk(KERN_WARNING, sdkp->device,
> + "Create zoned disk workqueue failed\n");
> + return;
> + }
> + init = true;
> + }
> +
> + /*
> + * Parse what we already got. If all zones are not parsed yet,
> + * kick start an update to get the remaining.
> + */
> + capacity = logical_to_sectors(sdkp->device, sdkp->capacity);
> + ret = zbc_parse_zones(sdkp, buf, SD_BUF_SIZE, &sector);
> + if (ret == 0 && sector < capacity) {
> + sd_zbc_update_zones(sdkp, sector, capacity - sector,
> + GFP_KERNEL, init);
> + drain_workqueue(sdkp->zone_work_q);
> + }
> + if (ret)
> + return;
> +
> + /*
> + * Analyze the zones layout: if all zones are the same size and
> + * the size is a power of 2, chunk the device and map discard to
> + * reset write pointer command. Otherwise, disable discard.
> + */
> + sdkp->zone_sectors = 0;
> + sdkp->nr_zones = 0;
> + sector = 0;
> + while(sector < capacity) {
> +
> + zone = blk_lookup_zone(q, sector);
> + if (!zone) {
> + sdkp->zone_sectors = 0;
> + sdkp->nr_zones = 0;
> + break;
> + }
> +
> + sector += zone->len;
> +
> + if (sdkp->zone_sectors == 0) {
> + sdkp->zone_sectors = zone->len;
> + } else if (sector != capacity &&
> + zone->len != sdkp->zone_sectors) {
> + sdkp->zone_sectors = 0;
> + sdkp->nr_zones = 0;
> + break;
> + }
> +
> + sdkp->nr_zones++;
> +
> + }
> +
> + if (!sdkp->zone_sectors ||
> + !is_power_of_2(sdkp->zone_sectors)) {
> + sd_config_discard(sdkp, SD_LBP_DISABLE);
> + if (sdkp->first_scan)
> + sd_printk(KERN_NOTICE, sdkp,
> + "%u zones (non constant zone size)\n",
> + sdkp->nr_zones);
> + return;
> + }
> +
> + /* Setup discard granularity to the zone size */
> + blk_queue_chunk_sectors(sdkp->disk->queue, sdkp->zone_sectors);
> + sdkp->max_unmap_blocks = sdkp->zone_sectors;
> + sdkp->unmap_alignment = sectors_to_logical(sdkp->device,
> + sdkp->zone_sectors);
> + sdkp->unmap_granularity = sdkp->unmap_alignment;
> + sd_config_discard(sdkp, SD_ZBC_RESET_WP);
> +
> + if (sdkp->first_scan) {
> + if (sdkp->nr_zones * sdkp->zone_sectors == capacity)
> + sd_printk(KERN_NOTICE, sdkp,
> + "%u zones of %zu sectors\n",
> + sdkp->nr_zones,
> + sdkp->zone_sectors);
> + else
> + sd_printk(KERN_NOTICE, sdkp,
> + "%u zones of %zu sectors "
> + "+ 1 runt zone\n",
> + sdkp->nr_zones - 1,
> + sdkp->zone_sectors);
> + }
> +}
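The layout analysis at the end of sd_zbc_read_zones() decides whether discard can be mapped to reset write pointer. Below is a minimal userspace sketch of that decision, assuming the zone lengths are already available as an array. The name uniform_zone_sectors and the array-based interface are hypothetical; the patch walks the zone cache instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static int is_pow2(uint64_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/*
 * Return the common zone size in sectors, or 0 when the layout does
 * not qualify. As in the patch, only the last zone may have a
 * different length (a "runt" zone), and the common size must be a
 * power of 2.
 */
static uint64_t uniform_zone_sectors(const uint64_t *len, size_t nr_zones)
{
	uint64_t zone_sectors = 0;
	size_t i;

	assert(len || nr_zones == 0);

	for (i = 0; i < nr_zones; i++) {
		if (zone_sectors == 0)
			zone_sectors = len[i];
		else if (i != nr_zones - 1 && len[i] != zone_sectors)
			return 0;
	}

	return is_pow2(zone_sectors) ? zone_sectors : 0;
}
```

A zero return corresponds to the SD_LBP_DISABLE branch, and a non-zero return to the branch that sets the chunk size and the unmap limits.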
> +
> +void sd_zbc_remove(struct scsi_disk *sdkp)
> +{
> +
> + sd_config_discard(sdkp, SD_LBP_DISABLE);
> +
> + if (sdkp->zone_work_q) {
> + drain_workqueue(sdkp->zone_work_q);
> + destroy_workqueue(sdkp->zone_work_q);
> + sdkp->zone_work_q = NULL;
> + blk_drop_zones(sdkp->disk->queue);
> + }
> +}
> +
> diff --git a/include/scsi/scsi_proto.h b/include/scsi/scsi_proto.h
> index d1defd1..6ba66e0 100644
> --- a/include/scsi/scsi_proto.h
> +++ b/include/scsi/scsi_proto.h
> @@ -299,4 +299,21 @@ struct scsi_lun {
> #define SCSI_ACCESS_STATE_MASK 0x0f
> #define SCSI_ACCESS_STATE_PREFERRED 0x80
>
> +/* Reporting options for REPORT ZONES */
> +enum zbc_zone_reporting_options {
> + ZBC_ZONE_REPORTING_OPTION_ALL = 0,
> + ZBC_ZONE_REPORTING_OPTION_EMPTY,
> + ZBC_ZONE_REPORTING_OPTION_IMPLICIT_OPEN,
> + ZBC_ZONE_REPORTING_OPTION_EXPLICIT_OPEN,
> + ZBC_ZONE_REPORTING_OPTION_CLOSED,
> + ZBC_ZONE_REPORTING_OPTION_FULL,
> + ZBC_ZONE_REPORTING_OPTION_READONLY,
> + ZBC_ZONE_REPORTING_OPTION_OFFLINE,
> + ZBC_ZONE_REPORTING_OPTION_NEED_RESET_WP = 0x10,
> + ZBC_ZONE_REPORTING_OPTION_NON_SEQWRITE,
> + ZBC_ZONE_REPORTING_OPTION_NON_WP = 0x3f,
> +};
> +
> +#define ZBC_REPORT_ZONE_PARTIAL 0x80
> +
Why don't we expose these enums via uapi?
> #endif /* _SCSI_PROTO_H_ */
> --
> 2.7.4
>
-- 
Shaun Tancheff
* Re: [PATCH 8/9] sd: Implement support for ZBC devices
@ 2016-09-20 5:40 ` Shaun Tancheff
0 siblings, 0 replies; 36+ messages in thread
From: Shaun Tancheff @ 2016-09-20 5:40 UTC (permalink / raw)
To: Damien Le Moal
Cc: linux-scsi, linux-block, Martin K. Petersen, Jens Axboe,
Hannes Reinecke, Hannes Reinecke
On Mon, Sep 19, 2016 at 4:27 PM, Damien Le Moal <damien.lemoal@hgst.com> wrote:
> From: Hannes Reinecke <hare@suse.com>
>
> Implement ZBC support functions to setup zoned disks and fill the
> block device zone information tree during the device scan. The
> zone information tree is also always updated on disk revalidation.
> This adds support for the REQ_OP_ZONE* operations and also implements
> the new RESET_WP provisioning mode so that discard requests can be
> mapped to the RESET WRITE POINTER command for devices with a constant
> zone size.
>
> The capacity read of the device triggers the zone information read
> for zoned block devices. As this needs the device zone model, the
> call to sd_read_capacity is moved after the call to
> sd_read_block_characteristics so that host-aware devices are
> properly initialized. The call to sd_zbc_read_zones in
> sd_read_capacity may change the device capacity obtained with
> the sd_read_capacity_16 function for devices reporting only the
> capacity of conventional zones at the beginning of the LBA range
> (i.e. devices with rc_basis set to 0).
>
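The capacity adjustment described here is small enough to model on its own. Below is a minimal userspace sketch, assuming the maximum LBA sits at bytes 8..15 of the REPORT ZONES header in big-endian order, as the patch reads it. The names be64_at and zbc_fixup_capacity are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Read a big-endian 64-bit value from an unaligned buffer. */
static uint64_t be64_at(const unsigned char *p)
{
	uint64_t v = 0;
	int i;

	for (i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/*
 * Model of the capacity fixup in sd_zbc_read_zones(): with
 * rc_basis == 0, READ CAPACITY may cover only the leading
 * conventional zones, so the larger of the current capacity and
 * max LBA + 1 from the zone report is used (in logical blocks).
 */
static uint64_t zbc_fixup_capacity(unsigned int rc_basis, uint64_t capacity,
				   const unsigned char *report)
{
	uint64_t max_lba;

	assert(report);

	if (rc_basis != 0)
		return capacity;

	max_lba = be64_at(&report[8]);
	return (max_lba + 1 > capacity) ? max_lba + 1 : capacity;
}
```

The capacity never shrinks in this model, matching the `lba + 1 > sdkp->capacity` test in the patch.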
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
> ---
> drivers/scsi/Makefile | 1 +
> drivers/scsi/sd.c | 147 ++++--
> drivers/scsi/sd.h | 68 +++
> drivers/scsi/sd_zbc.c | 1097 +++++++++++++++++++++++++++++++++++++++++++++
> include/scsi/scsi_proto.h | 17 +
> 5 files changed, 1304 insertions(+), 26 deletions(-)
> create mode 100644 drivers/scsi/sd_zbc.c
>
> diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile
> index d539798..fabcb6d 100644
> --- a/drivers/scsi/Makefile
> +++ b/drivers/scsi/Makefile
> @@ -179,6 +179,7 @@ hv_storvsc-y := storvsc_drv.o
>
> sd_mod-objs := sd.o
> sd_mod-$(CONFIG_BLK_DEV_INTEGRITY) += sd_dif.o
> +sd_mod-$(CONFIG_BLK_DEV_ZONED) += sd_zbc.o
>
> sr_mod-objs := sr.o sr_ioctl.o sr_vendor.o
> ncr53c8xx-flags-$(CONFIG_SCSI_ZALON) \
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index d3e852a..46b8b78 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -92,6 +92,7 @@ MODULE_ALIAS_BLOCKDEV_MAJOR(SCSI_DISK15_MAJOR);
> MODULE_ALIAS_SCSI_DEVICE(TYPE_DISK);
> MODULE_ALIAS_SCSI_DEVICE(TYPE_MOD);
> MODULE_ALIAS_SCSI_DEVICE(TYPE_RBC);
> +MODULE_ALIAS_SCSI_DEVICE(TYPE_ZBC);
>
> #if !defined(CONFIG_DEBUG_BLOCK_EXT_DEVT)
> #define SD_MINORS 16
> @@ -99,7 +100,6 @@ MODULE_ALIAS_SCSI_DEVICE(TYPE_RBC);
> #define SD_MINORS 0
> #endif
>
> -static void sd_config_discard(struct scsi_disk *, unsigned int);
> static void sd_config_write_same(struct scsi_disk *);
> static int sd_revalidate_disk(struct gendisk *);
> static void sd_unlock_native_capacity(struct gendisk *disk);
> @@ -162,7 +162,7 @@ cache_type_store(struct device *dev, struct device_attribute *attr,
> static const char temp[] = "temporary ";
> int len;
>
> - if (sdp->type != TYPE_DISK)
> + if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
> /* no cache control on RBC devices; theoretically they
> * can do it, but there's probably so many exceptions
> * it's not worth the risk */
> @@ -261,7 +261,7 @@ allow_restart_store(struct device *dev, struct device_attribute *attr,
> if (!capable(CAP_SYS_ADMIN))
> return -EACCES;
>
> - if (sdp->type != TYPE_DISK)
> + if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
> return -EINVAL;
>
> sdp->allow_restart = simple_strtoul(buf, NULL, 10);
> @@ -369,6 +369,7 @@ static const char *lbp_mode[] = {
> [SD_LBP_WS16] = "writesame_16",
> [SD_LBP_WS10] = "writesame_10",
> [SD_LBP_ZERO] = "writesame_zero",
> + [SD_ZBC_RESET_WP] = "reset_wp",
> [SD_LBP_DISABLE] = "disabled",
> };
>
> @@ -391,6 +392,13 @@ provisioning_mode_store(struct device *dev, struct device_attribute *attr,
> if (!capable(CAP_SYS_ADMIN))
> return -EACCES;
>
> + if (sdkp->zoned == 1 || sdp->type == TYPE_ZBC) {
> + if (!strncmp(buf, lbp_mode[SD_ZBC_RESET_WP], 20)) {
> + sd_config_discard(sdkp, SD_ZBC_RESET_WP);
> + return count;
> + }
> + return -EINVAL;
> + }
> if (sdp->type != TYPE_DISK)
> return -EINVAL;
>
> @@ -458,7 +466,7 @@ max_write_same_blocks_store(struct device *dev, struct device_attribute *attr,
> if (!capable(CAP_SYS_ADMIN))
> return -EACCES;
>
> - if (sdp->type != TYPE_DISK)
> + if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
> return -EINVAL;
>
> err = kstrtoul(buf, 10, &max);
> @@ -631,7 +639,7 @@ static unsigned char sd_setup_protect_cmnd(struct scsi_cmnd *scmd,
> return protect;
> }
>
> -static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode)
> +void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode)
> {
> struct request_queue *q = sdkp->disk->queue;
> unsigned int logical_block_size = sdkp->device->sector_size;
> @@ -683,6 +691,11 @@ static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode)
> q->limits.discard_zeroes_data = sdkp->lbprz;
> break;
>
> + case SD_ZBC_RESET_WP:
> + max_blocks = min_not_zero(sdkp->max_unmap_blocks,
> + (u32)SD_MAX_WS16_BLOCKS);
> + break;
> +
> case SD_LBP_ZERO:
> max_blocks = min_not_zero(sdkp->max_ws_blocks,
> (u32)SD_MAX_WS10_BLOCKS);
> @@ -711,16 +724,20 @@ static int sd_setup_discard_cmnd(struct scsi_cmnd *cmd)
> unsigned int nr_sectors = blk_rq_sectors(rq);
> unsigned int nr_bytes = blk_rq_bytes(rq);
> unsigned int len;
> - int ret;
> + int ret = BLKPREP_OK;
> char *buf;
> - struct page *page;
> + struct page *page = NULL;
>
> sector >>= ilog2(sdp->sector_size) - 9;
> nr_sectors >>= ilog2(sdp->sector_size) - 9;
>
> - page = alloc_page(GFP_ATOMIC | __GFP_ZERO);
> - if (!page)
> - return BLKPREP_DEFER;
> + if (sdkp->provisioning_mode != SD_ZBC_RESET_WP) {
> + page = alloc_page(GFP_ATOMIC | __GFP_ZERO);
> + if (!page)
> + return BLKPREP_DEFER;
> + }
> +
> + rq->completion_data = page;
>
> switch (sdkp->provisioning_mode) {
> case SD_LBP_UNMAP:
> @@ -760,12 +777,19 @@ static int sd_setup_discard_cmnd(struct scsi_cmnd *cmd)
> len = sdkp->device->sector_size;
> break;
>
> + case SD_ZBC_RESET_WP:
> + ret = sd_zbc_setup_reset_cmnd(cmd);
> + if (ret != BLKPREP_OK)
> + goto out;
> + /* Reset Write Pointer doesn't have a payload */
> + len = 0;
> + break;
> +
> default:
> ret = BLKPREP_INVALID;
> goto out;
> }
>
> - rq->completion_data = page;
> rq->timeout = SD_TIMEOUT;
>
> cmd->transfersize = len;
> @@ -779,13 +803,17 @@ static int sd_setup_discard_cmnd(struct scsi_cmnd *cmd)
> * discarded on disk. This allows us to report completion on the full
> * amount of blocks described by the request.
> */
> - blk_add_request_payload(rq, page, 0, len);
> - ret = scsi_init_io(cmd);
> + if (len) {
> + blk_add_request_payload(rq, page, 0, len);
> + ret = scsi_init_io(cmd);
> + }
> rq->__data_len = nr_bytes;
>
> out:
> - if (ret != BLKPREP_OK)
> + if (page && ret != BLKPREP_OK) {
> + rq->completion_data = NULL;
> __free_page(page);
> + }
> return ret;
> }
>
> @@ -843,6 +871,13 @@ static int sd_setup_write_same_cmnd(struct scsi_cmnd *cmd)
>
> BUG_ON(bio_offset(bio) || bio_iovec(bio).bv_len != sdp->sector_size);
>
> + if (sdkp->zoned == 1 || sdp->type == TYPE_ZBC) {
> + /* sd_zbc_setup_read_write uses block layer sector units */
> + ret = sd_zbc_setup_read_write(sdkp, rq, sector, &nr_sectors);
> + if (ret != BLKPREP_OK)
> + return ret;
> + }
> +
> sector >>= ilog2(sdp->sector_size) - 9;
> nr_sectors >>= ilog2(sdp->sector_size) - 9;
>
> @@ -962,6 +997,13 @@ static int sd_setup_read_write_cmnd(struct scsi_cmnd *SCpnt)
> SCSI_LOG_HLQUEUE(2, scmd_printk(KERN_INFO, SCpnt, "block=%llu\n",
> (unsigned long long)block));
>
> + if (sdkp->zoned == 1 || sdp->type == TYPE_ZBC) {
> + /* sd_zbc_setup_read_write uses block layer sector units */
> + ret = sd_zbc_setup_read_write(sdkp, rq, block, &this_count);
> + if (ret != BLKPREP_OK)
> + goto out;
> + }
> +
> /*
> * If we have a 1K hardware sectorsize, prevent access to single
> * 512 byte sectors. In theory we could handle this - in fact
> @@ -1148,6 +1190,16 @@ static int sd_init_command(struct scsi_cmnd *cmd)
> case REQ_OP_READ:
> case REQ_OP_WRITE:
> return sd_setup_read_write_cmnd(cmd);
> + case REQ_OP_ZONE_REPORT:
> + return sd_zbc_setup_report_cmnd(cmd);
> + case REQ_OP_ZONE_RESET:
> + return sd_zbc_setup_reset_cmnd(cmd);
> + case REQ_OP_ZONE_OPEN:
> + return sd_zbc_setup_open_cmnd(cmd);
> + case REQ_OP_ZONE_CLOSE:
> + return sd_zbc_setup_close_cmnd(cmd);
> + case REQ_OP_ZONE_FINISH:
> + return sd_zbc_setup_finish_cmnd(cmd);
> default:
> BUG();
> }
> @@ -1157,7 +1209,8 @@ static void sd_uninit_command(struct scsi_cmnd *SCpnt)
> {
> struct request *rq = SCpnt->request;
>
> - if (req_op(rq) == REQ_OP_DISCARD)
> + if (req_op(rq) == REQ_OP_DISCARD &&
> + rq->completion_data)
> __free_page(rq->completion_data);
>
> if (SCpnt->cmnd != rq->cmd) {
> @@ -1778,8 +1831,16 @@ static int sd_done(struct scsi_cmnd *SCpnt)
> int sense_deferred = 0;
> unsigned char op = SCpnt->cmnd[0];
> unsigned char unmap = SCpnt->cmnd[1] & 8;
> + unsigned char sa = SCpnt->cmnd[1] & 0xf;
>
> - if (req_op(req) == REQ_OP_DISCARD || req_op(req) == REQ_OP_WRITE_SAME) {
> + switch(req_op(req)) {
> + case REQ_OP_DISCARD:
> + case REQ_OP_WRITE_SAME:
> + case REQ_OP_ZONE_REPORT:
> + case REQ_OP_ZONE_RESET:
> + case REQ_OP_ZONE_OPEN:
> + case REQ_OP_ZONE_CLOSE:
> + case REQ_OP_ZONE_FINISH:
> if (!result) {
> good_bytes = blk_rq_bytes(req);
> scsi_set_resid(SCpnt, 0);
> @@ -1787,6 +1848,7 @@ static int sd_done(struct scsi_cmnd *SCpnt)
> good_bytes = 0;
> scsi_set_resid(SCpnt, blk_rq_bytes(req));
> }
> + break;
> }
>
> if (result) {
> @@ -1829,6 +1891,10 @@ static int sd_done(struct scsi_cmnd *SCpnt)
> case UNMAP:
> sd_config_discard(sdkp, SD_LBP_DISABLE);
> break;
> + case ZBC_OUT:
> + if (sa == ZO_RESET_WRITE_POINTER)
> + sd_config_discard(sdkp, SD_LBP_DISABLE);
> + break;
> case WRITE_SAME_16:
> case WRITE_SAME:
> if (unmap)
> @@ -1847,7 +1913,11 @@ static int sd_done(struct scsi_cmnd *SCpnt)
> default:
> break;
> }
> +
> out:
> + if (sdkp->zoned == 1 || sdkp->device->type == TYPE_ZBC)
> + sd_zbc_done(SCpnt, &sshdr);
> +
> SCSI_LOG_HLCOMPLETE(1, scmd_printk(KERN_INFO, SCpnt,
> "sd_done: completed %d of %d bytes\n",
> good_bytes, scsi_bufflen(SCpnt)));
> @@ -1982,7 +2052,6 @@ sd_spinup_disk(struct scsi_disk *sdkp)
> }
> }
>
> -
> /*
> * Determine whether disk supports Data Integrity Field.
> */
> @@ -2132,6 +2201,9 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
> /* Logical blocks per physical block exponent */
> sdkp->physical_block_size = (1 << (buffer[13] & 0xf)) * sector_size;
>
> + /* RC basis */
> + sdkp->rc_basis = (buffer[12] >> 4) & 0x3;
> +
> /* Lowest aligned logical block */
> alignment = ((buffer[14] & 0x3f) << 8 | buffer[15]) * sector_size;
> blk_queue_alignment_offset(sdp->request_queue, alignment);
> @@ -2322,6 +2394,11 @@ got_data:
> sector_size = 512;
> }
> blk_queue_logical_block_size(sdp->request_queue, sector_size);
> + blk_queue_physical_block_size(sdp->request_queue,
> + sdkp->physical_block_size);
> + sdkp->device->sector_size = sector_size;
> +
> + sd_zbc_read_zones(sdkp, buffer);
>
> {
> char cap_str_2[10], cap_str_10[10];
> @@ -2348,9 +2425,6 @@ got_data:
> if (sdkp->capacity > 0xffffffff)
> sdp->use_16_for_rw = 1;
>
> - blk_queue_physical_block_size(sdp->request_queue,
> - sdkp->physical_block_size);
> - sdkp->device->sector_size = sector_size;
> }
>
> /* called with buffer of length 512 */
> @@ -2612,7 +2686,7 @@ static void sd_read_app_tag_own(struct scsi_disk *sdkp, unsigned char *buffer)
> struct scsi_mode_data data;
> struct scsi_sense_hdr sshdr;
>
> - if (sdp->type != TYPE_DISK)
> + if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
> return;
>
> if (sdkp->protection_type == 0)
> @@ -2719,6 +2793,7 @@ static void sd_read_block_limits(struct scsi_disk *sdkp)
> */
> static void sd_read_block_characteristics(struct scsi_disk *sdkp)
> {
> + struct request_queue *q = sdkp->disk->queue;
> unsigned char *buffer;
> u16 rot;
> const int vpd_len = 64;
> @@ -2733,10 +2808,21 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
> rot = get_unaligned_be16(&buffer[4]);
>
> if (rot == 1) {
> - queue_flag_set_unlocked(QUEUE_FLAG_NONROT, sdkp->disk->queue);
> - queue_flag_clear_unlocked(QUEUE_FLAG_ADD_RANDOM, sdkp->disk->queue);
> + queue_flag_set_unlocked(QUEUE_FLAG_NONROT, q);
> + queue_flag_clear_unlocked(QUEUE_FLAG_ADD_RANDOM, q);
> }
>
> + sdkp->zoned = (buffer[8] >> 4) & 3;
> + if (sdkp->zoned == 1)
> + q->limits.zoned = BLK_ZONED_HA;
> + else if (sdkp->device->type == TYPE_ZBC)
> + q->limits.zoned = BLK_ZONED_HM;
> + else
> + q->limits.zoned = BLK_ZONED_NONE;
> + if (blk_queue_zoned(q) && sdkp->first_scan)
> + sd_printk(KERN_NOTICE, sdkp, "Host-%s zoned block device\n",
> + q->limits.zoned == BLK_ZONED_HM ? "managed" : "aware");
> +
> out:
> kfree(buffer);
> }
> @@ -2835,14 +2921,14 @@ static int sd_revalidate_disk(struct gendisk *disk)
> * react badly if we do.
> */
> if (sdkp->media_present) {
> - sd_read_capacity(sdkp, buffer);
> -
> if (scsi_device_supports_vpd(sdp)) {
> sd_read_block_provisioning(sdkp);
> sd_read_block_limits(sdkp);
> sd_read_block_characteristics(sdkp);
> }
>
> + sd_read_capacity(sdkp, buffer);
> +
> sd_read_write_protect_flag(sdkp, buffer);
> sd_read_cache_type(sdkp, buffer);
> sd_read_app_tag_own(sdkp, buffer);
> @@ -3040,9 +3126,16 @@ static int sd_probe(struct device *dev)
>
> scsi_autopm_get_device(sdp);
> error = -ENODEV;
> - if (sdp->type != TYPE_DISK && sdp->type != TYPE_MOD && sdp->type != TYPE_RBC)
> + if (sdp->type != TYPE_DISK &&
> + sdp->type != TYPE_ZBC &&
> + sdp->type != TYPE_MOD &&
> + sdp->type != TYPE_RBC)
> goto out;
>
> +#ifndef CONFIG_BLK_DEV_ZONED
> + if (sdp->type == TYPE_ZBC)
> + goto out;
> +#endif
> SCSI_LOG_HLQUEUE(3, sdev_printk(KERN_INFO, sdp,
> "sd_probe\n"));
>
> @@ -3146,6 +3239,8 @@ static int sd_remove(struct device *dev)
> del_gendisk(sdkp->disk);
> sd_shutdown(dev);
>
> + sd_zbc_remove(sdkp);
> +
> blk_register_region(devt, SD_MINORS, NULL,
> sd_default_probe, NULL, NULL);
>
> diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
> index 765a6f1..3452871 100644
> --- a/drivers/scsi/sd.h
> +++ b/drivers/scsi/sd.h
> @@ -56,6 +56,7 @@ enum {
> SD_LBP_WS16, /* Use WRITE SAME(16) with UNMAP bit */
> SD_LBP_WS10, /* Use WRITE SAME(10) with UNMAP bit */
> SD_LBP_ZERO, /* Use WRITE SAME(10) with zero payload */
> + SD_ZBC_RESET_WP, /* Use RESET WRITE POINTER */
> SD_LBP_DISABLE, /* Discard disabled due to failed cmd */
> };
>
Can we add SD_ZBC_RESET_WP in a separate patch?
> @@ -64,6 +65,11 @@ struct scsi_disk {
> struct scsi_device *device;
> struct device dev;
> struct gendisk *disk;
> +#ifdef CONFIG_BLK_DEV_ZONED
> + struct workqueue_struct *zone_work_q;
> + sector_t zone_sectors;
> + unsigned int nr_zones;
> +#endif
> atomic_t openers;
> sector_t capacity; /* size in logical blocks */
> u32 max_xfer_blocks;
> @@ -94,6 +100,8 @@ struct scsi_disk {
> unsigned lbpvpd : 1;
> unsigned ws10 : 1;
> unsigned ws16 : 1;
> + unsigned rc_basis: 2;
> + unsigned zoned: 2;
> };
> #define to_scsi_disk(obj) container_of(obj,struct scsi_disk,dev)
>
> @@ -156,6 +164,13 @@ static inline unsigned int logical_to_bytes(struct scsi_device *sdev, sector_t b
> return blocks * sdev->sector_size;
> }
>
> +static inline sector_t sectors_to_logical(struct scsi_device *sdev, sector_t sector)
> +{
> + return sector >> (ilog2(sdev->sector_size) - 9);
> +}
> +
> +extern void sd_config_discard(struct scsi_disk *, unsigned int);
> +
> /*
> * A DIF-capable target device can be formatted with different
> * protection schemes. Currently 0 through 3 are defined:
> @@ -269,4 +284,57 @@ static inline void sd_dif_complete(struct scsi_cmnd *cmd, unsigned int a)
>
> #endif /* CONFIG_BLK_DEV_INTEGRITY */
>
> +#ifdef CONFIG_BLK_DEV_ZONED
> +
> +extern void sd_zbc_read_zones(struct scsi_disk *, char *);
> +extern void sd_zbc_remove(struct scsi_disk *);
> +extern int sd_zbc_setup_read_write(struct scsi_disk *, struct request *,
> + sector_t, unsigned int *);
> +extern int sd_zbc_setup_report_cmnd(struct scsi_cmnd *);
> +extern int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *);
> +extern int sd_zbc_setup_open_cmnd(struct scsi_cmnd *);
> +extern int sd_zbc_setup_close_cmnd(struct scsi_cmnd *);
> +extern int sd_zbc_setup_finish_cmnd(struct scsi_cmnd *);
> +extern void sd_zbc_done(struct scsi_cmnd *, struct scsi_sense_hdr *);
> +
> +#else /* CONFIG_BLK_DEV_ZONED */
> +
> +static inline void sd_zbc_read_zones(struct scsi_disk *sdkp,
> + unsigned char *buf) {}
> +static inline void sd_zbc_remove(struct scsi_disk *sdkp) {}
> +
> +static inline int sd_zbc_setup_read_write(struct scsi_disk *sdkp,
> + struct request *rq, sector_t sector,
> + unsigned int *num_sectors)
> +{
> + /* Let the drive fail requests */
> + return BLKPREP_OK;
> +}
> +
> +static inline int sd_zbc_setup_report_cmnd(struct scsi_cmnd *cmd)
> +{
> + return BLKPREP_KILL;
> +}
> +static inline int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *cmd)
> +{
> + return BLKPREP_KILL;
> +}
> +static inline int sd_zbc_setup_open_cmnd(struct scsi_cmnd *cmd)
> +{
> + return BLKPREP_KILL;
> +}
> +static inline int sd_zbc_setup_close_cmnd(struct scsi_cmnd *cmd)
> +{
> + return BLKPREP_KILL;
> +}
> +static inline int sd_zbc_setup_finish_cmnd(struct scsi_cmnd *cmd)
> +{
> + return BLKPREP_KILL;
> +}
> +
> +static inline void sd_zbc_done(struct scsi_cmnd *cmd,
> + struct scsi_sense_hdr *sshdr) {}
> +
> +#endif /* CONFIG_BLK_DEV_ZONED */
> +
> #endif /* _SCSI_DISK_H */
> diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
> new file mode 100644
> index 0000000..ec9c3fc
> --- /dev/null
> +++ b/drivers/scsi/sd_zbc.c
> @@ -0,0 +1,1097 @@
> +/*
> + * SCSI Zoned Block commands
> + *
> + * Copyright (C) 2014-2015 SUSE Linux GmbH
> + * Written by: Hannes Reinecke <hare@suse.de>
> + * Modified by: Damien Le Moal <damien.lemoal@hgst.com>
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License version
> + * 2 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + * General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; see the file COPYING. If not, write to
> + * the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139,
> + * USA.
> + *
> + */
> +
> +#include <linux/blkdev.h>
> +#include <linux/rbtree.h>
> +
> +#include <asm/unaligned.h>
> +
> +#include <scsi/scsi.h>
> +#include <scsi/scsi_cmnd.h>
> +#include <scsi/scsi_dbg.h>
> +#include <scsi/scsi_device.h>
> +#include <scsi/scsi_driver.h>
> +#include <scsi/scsi_host.h>
> +#include <scsi/scsi_eh.h>
> +
> +#include "sd.h"
> +#include "scsi_priv.h"
> +
> +enum zbc_zone_type {
> + ZBC_ZONE_TYPE_CONV = 0x1,
> + ZBC_ZONE_TYPE_SEQWRITE_REQ,
> + ZBC_ZONE_TYPE_SEQWRITE_PREF,
> + ZBC_ZONE_TYPE_RESERVED,
> +};
> +
> +enum zbc_zone_cond {
> + ZBC_ZONE_COND_NO_WP,
> + ZBC_ZONE_COND_EMPTY,
> + ZBC_ZONE_COND_IMP_OPEN,
> + ZBC_ZONE_COND_EXP_OPEN,
> + ZBC_ZONE_COND_CLOSED,
> + ZBC_ZONE_COND_READONLY = 0xd,
> + ZBC_ZONE_COND_FULL,
> + ZBC_ZONE_COND_OFFLINE,
> +};
> +
> +#define SD_ZBC_BUF_SIZE 131072
> +
> +#define sd_zbc_debug(sdkp, fmt, args...) \
> + pr_debug("%s %s [%s]: " fmt, \
> + dev_driver_string(&(sdkp)->device->sdev_gendev), \
> + dev_name(&(sdkp)->device->sdev_gendev), \
> + (sdkp)->disk->disk_name, ## args)
> +
> +#define sd_zbc_debug_ratelimit(sdkp, fmt, args...) \
> + do { \
> + if (printk_ratelimit()) \
> + sd_zbc_debug(sdkp, fmt, ## args); \
> + } while( 0 )
> +
> +#define sd_zbc_err(sdkp, fmt, args...) \
> + pr_err("%s %s [%s]: " fmt, \
> + dev_driver_string(&(sdkp)->device->sdev_gendev), \
> + dev_name(&(sdkp)->device->sdev_gendev), \
> + (sdkp)->disk->disk_name, ## args)
> +
> +struct zbc_zone_work {
> + struct work_struct zone_work;
> + struct scsi_disk *sdkp;
> + sector_t sector;
> + sector_t nr_sects;
> + bool init;
> + unsigned int nr_zones;
> +};
> +
> +struct blk_zone *zbc_desc_to_zone(struct scsi_disk *sdkp, unsigned char *rec)
> +{
> + struct blk_zone *zone;
> +
> + zone = kzalloc(sizeof(struct blk_zone), GFP_KERNEL);
> + if (!zone)
> + return NULL;
> +
> + /* Zone type */
> + switch(rec[0] & 0x0f) {
> + case ZBC_ZONE_TYPE_CONV:
> + case ZBC_ZONE_TYPE_SEQWRITE_REQ:
> + case ZBC_ZONE_TYPE_SEQWRITE_PREF:
> + zone->type = rec[0] & 0x0f;
> + break;
> + default:
> + zone->type = BLK_ZONE_TYPE_UNKNOWN;
> + break;
> + }
> +
> + /* Zone condition */
> + zone->cond = (rec[1] >> 4) & 0xf;
> + if (rec[1] & 0x01)
> + zone->reset = 1;
> + if (rec[1] & 0x02)
> + zone->non_seq = 1;
> +
> + /* Zone start sector and length */
> + zone->len = logical_to_sectors(sdkp->device,
> + get_unaligned_be64(&rec[8]));
> + zone->start = logical_to_sectors(sdkp->device,
> + get_unaligned_be64(&rec[16]));
> +
> + /* Zone write pointer */
> + if (blk_zone_is_empty(zone) &&
> + zone->wp != zone->start)
> + zone->wp = zone->start;
> + else if (blk_zone_is_full(zone))
> + zone->wp = zone->start + zone->len;
> + else if (blk_zone_is_seq(zone))
> + zone->wp = logical_to_sectors(sdkp->device,
> + get_unaligned_be64(&rec[24]));
> + else
> + zone->wp = (sector_t)-1;
> +
> + return zone;
> +}
> +
> +static int zbc_parse_zones(struct scsi_disk *sdkp, unsigned char *buf,
> + unsigned int buf_len, sector_t *next_sector)
> +{
> + struct request_queue *q = sdkp->disk->queue;
> + sector_t capacity = logical_to_sectors(sdkp->device, sdkp->capacity);
> + unsigned char *rec = buf;
> + unsigned int zone_len, list_length;
> +
> + /* Parse REPORT ZONES header */
> + list_length = get_unaligned_be32(&buf[0]);
> + rec = buf + 64;
> + list_length += 64;
> +
> + if (list_length < buf_len)
> + buf_len = list_length;
> +
> + /* Parse REPORT ZONES zone descriptors */
> + *next_sector = capacity;
> + while (rec < buf + buf_len) {
> +
> + struct blk_zone *new, *old;
> +
> + new = zbc_desc_to_zone(sdkp, rec);
> + if (!new)
> + return -ENOMEM;
> +
> + zone_len = new->len;
> + *next_sector = new->start + zone_len;
> +
> + old = blk_insert_zone(q, new);
> + if (old) {
> + blk_lock_zone(old);
> +
> + /*
> + * Always update the zone state flags and the zone
> + * offline and read-only condition as the drive may
> + * change those independently of the commands being
> + * executed
> + */
> + old->reset = new->reset;
> + old->non_seq = new->non_seq;
> + if (blk_zone_is_offline(new) ||
> + blk_zone_is_readonly(new))
> + old->cond = new->cond;
> +
> + if (blk_zone_in_update(old)) {
> + old->cond = new->cond;
> + old->wp = new->wp;
> + blk_clear_zone_update(old);
> + }
> +
> + blk_unlock_zone(old);
> +
> + kfree(new);
> + }
> +
> + rec += 64;
> +
> + }
> +
> + return 0;
> +}
> +
> +/**
> + * sd_zbc_report_zones - Issue a REPORT ZONES scsi command
> + * @sdkp: SCSI disk to which the command should be send
> + * @buffer: response buffer
> + * @bufflen: length of @buffer
> + * @start_sector: logical sector for the zone information should be reported
> + * @option: reporting option to be used
> + * @partial: flag to set the 'partial' bit for report zones command
> + */
> +int sd_zbc_report_zones(struct scsi_disk *sdkp, unsigned char *buffer,
> + int bufflen, sector_t start_sector,
> + enum zbc_zone_reporting_options option, bool partial)
> +{
> + struct scsi_device *sdp = sdkp->device;
> + const int timeout = sdp->request_queue->rq_timeout;
> + struct scsi_sense_hdr sshdr;
> + sector_t start_lba = sectors_to_logical(sdkp->device, start_sector);
> + unsigned char cmd[16];
> + int result;
> +
> + if (!scsi_device_online(sdp))
> + return -ENODEV;
> +
> + sd_zbc_debug(sdkp, "REPORT ZONES lba %zu len %d\n",
> + start_lba, bufflen);
> +
> + memset(cmd, 0, 16);
> + cmd[0] = ZBC_IN;
> + cmd[1] = ZI_REPORT_ZONES;
> + put_unaligned_be64(start_lba, &cmd[2]);
> + put_unaligned_be32(bufflen, &cmd[10]);
> + cmd[14] = (partial ? ZBC_REPORT_ZONE_PARTIAL : 0) | option;
> + memset(buffer, 0, bufflen);
> +
> + result = scsi_execute_req(sdp, cmd, DMA_FROM_DEVICE,
> + buffer, bufflen, &sshdr,
> + timeout, SD_MAX_RETRIES, NULL);
> +
> + if (result) {
> + sd_zbc_err(sdkp,
> + "REPORT ZONES lba %zu failed with %d/%d\n",
> + start_lba, host_byte(result), driver_byte(result));
> + return -EIO;
> + }
> +
> + return 0;
> +}
> +
> +/**
> + * Set or clear the update flag of all zones contained
> + * in the range sector..sector+nr_sects.
> + * Return the number of zones marked/cleared.
> + */
> +static int __sd_zbc_zones_updating(struct scsi_disk *sdkp,
> + sector_t sector, sector_t nr_sects,
> + bool set)
> +{
> + struct request_queue *q = sdkp->disk->queue;
> + struct blk_zone *zone;
> + struct rb_node *node;
> + unsigned long flags;
> + int nr_zones = 0;
> +
> + if (!nr_sects) {
> + /* All zones */
> + sector = 0;
> + nr_sects = logical_to_sectors(sdkp->device, sdkp->capacity);
> + }
> +
> + spin_lock_irqsave(&q->zones_lock, flags);
> + for (node = rb_first(&q->zones); node && nr_sects; node = rb_next(node)) {
> + zone = rb_entry(node, struct blk_zone, node);
> + if (sector < zone->start || sector >= (zone->start + zone->len))
> + continue;
> + if (set) {
> + if (!test_and_set_bit_lock(BLK_ZONE_IN_UPDATE, &zone->flags))
> + nr_zones++;
> + } else if (test_and_clear_bit(BLK_ZONE_IN_UPDATE, &zone->flags)) {
> + wake_up_bit(&zone->flags, BLK_ZONE_IN_UPDATE);
> + nr_zones++;
> + }
> + sector = zone->start + zone->len;
> + if (nr_sects <= zone->len)
> + nr_sects = 0;
> + else
> + nr_sects -= zone->len;
> + }
> + spin_unlock_irqrestore(&q->zones_lock, flags);
> +
> + return nr_zones;
> +}
> +
> +static inline int sd_zbc_set_zones_updating(struct scsi_disk *sdkp,
> + sector_t sector, sector_t nr_sects)
> +{
> + return __sd_zbc_zones_updating(sdkp, sector, nr_sects, true);
> +}
> +
> +static inline int sd_zbc_clear_zones_updating(struct scsi_disk *sdkp,
> + sector_t sector, sector_t nr_sects)
> +{
> + return __sd_zbc_zones_updating(sdkp, sector, nr_sects, false);
> +}
> +
> +static void sd_zbc_start_queue(struct request_queue *q)
> +{
> + unsigned long flags;
> +
> + if (q->mq_ops) {
> + blk_mq_start_hw_queues(q);
> + } else {
> + spin_lock_irqsave(q->queue_lock, flags);
> + blk_start_queue(q);
> + spin_unlock_irqrestore(q->queue_lock, flags);
> + }
> +}
> +
> +static void sd_zbc_update_zone_work(struct work_struct *work)
> +{
> + struct zbc_zone_work *zwork =
> + container_of(work, struct zbc_zone_work, zone_work);
> + struct scsi_disk *sdkp = zwork->sdkp;
> + sector_t capacity = logical_to_sectors(sdkp->device, sdkp->capacity);
> + struct request_queue *q = sdkp->disk->queue;
> + sector_t end_sector, sector = zwork->sector;
> + unsigned int bufsize;
> + unsigned char *buf;
> + int ret = -ENOMEM;
> +
> + /* Get a buffer */
> + if (!zwork->nr_zones) {
> + bufsize = SD_ZBC_BUF_SIZE;
> + } else {
> + bufsize = (zwork->nr_zones + 1) * 64;
> + if (bufsize < 512)
> + bufsize = 512;
> + else if (bufsize > SD_ZBC_BUF_SIZE)
> + bufsize = SD_ZBC_BUF_SIZE;
> + else
> + bufsize = (bufsize + 511) & ~511;
> + }
> + buf = kmalloc(bufsize, GFP_KERNEL | GFP_DMA);
> + if (!buf) {
> + sd_zbc_err(sdkp, "Failed to allocate zone report buffer\n");
> + goto done_free;
> + }
> +
> + /* Process sector range */
> + end_sector = zwork->sector + zwork->nr_sects;
> + while(sector < min(end_sector, capacity)) {
> +
> + /* Get zone report */
> + ret = sd_zbc_report_zones(sdkp, buf, bufsize, sector,
> + ZBC_ZONE_REPORTING_OPTION_ALL, true);
> + if (ret)
> + break;
> +
> + ret = zbc_parse_zones(sdkp, buf, bufsize, &sector);
> + if (ret)
> + break;
> +
> + /* Kick start the queue to allow requests waiting */
> + /* for the zones just updated to run */
> + sd_zbc_start_queue(q);
> +
> + }
> +
> +done_free:
> + if (ret)
> + sd_zbc_clear_zones_updating(sdkp, zwork->sector, zwork->nr_sects);
> + if (buf)
> + kfree(buf);
> + kfree(zwork);
> +}
> +
> +/**
> + * sd_zbc_update_zones - Update zone information for zones starting
> + * from @start_sector. If not in init mode, the update is done only
> + * for zones marked with update flag.
> + * @sdkp: SCSI disk for which the zone information needs to be updated
> + * @start_sector: First sector of the first zone to be updated
> + * @bufsize: buffersize to be allocated for report zones
> + */
> +static int sd_zbc_update_zones(struct scsi_disk *sdkp,
> + sector_t sector, sector_t nr_sects,
> + gfp_t gfpflags, bool init)
> +{
> + struct zbc_zone_work *zwork;
> +
> + zwork = kzalloc(sizeof(struct zbc_zone_work), gfpflags);
> + if (!zwork) {
> + sd_zbc_err(sdkp, "Failed to allocate zone work\n");
> + return -ENOMEM;
> + }
> +
> + if (!nr_sects) {
> + /* All zones */
> + sector = 0;
> + nr_sects = logical_to_sectors(sdkp->device, sdkp->capacity);
> + }
> +
> + INIT_WORK(&zwork->zone_work, sd_zbc_update_zone_work);
> + zwork->sdkp = sdkp;
> + zwork->sector = sector;
> + zwork->nr_sects = nr_sects;
> + zwork->init = init;
> +
> + if (!init)
> + /* Mark the zones falling in the report as updating */
> + zwork->nr_zones = sd_zbc_set_zones_updating(sdkp, sector, nr_sects);
> +
> + if (init || zwork->nr_zones)
> + queue_work(sdkp->zone_work_q, &zwork->zone_work);
> + else
> + kfree(zwork);
> +
> + return 0;
> +}
> +
> +int sd_zbc_setup_report_cmnd(struct scsi_cmnd *cmd)
> +{
> + struct request *rq = cmd->request;
> + struct gendisk *disk = rq->rq_disk;
> + struct scsi_disk *sdkp = scsi_disk(disk);
> + int ret;
> +
> + if (!sdkp->zone_work_q)
> + return BLKPREP_KILL;
> +
> + ret = sd_zbc_update_zones(sdkp, blk_rq_pos(rq), blk_rq_sectors(rq),
> + GFP_ATOMIC, false);
> + if (unlikely(ret))
> + return BLKPREP_DEFER;
> +
> + return BLKPREP_DONE;
> +}
> +
> +static void sd_zbc_setup_action_cmnd(struct scsi_cmnd *cmd,
> + u8 action,
> + bool all)
> +{
> + struct request *rq = cmd->request;
> + struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
> + sector_t lba;
> +
> + cmd->cmd_len = 16;
> + cmd->cmnd[0] = ZBC_OUT;
> + cmd->cmnd[1] = action;
> + if (all) {
> + cmd->cmnd[14] |= 0x01;
> + } else {
> + lba = sectors_to_logical(sdkp->device, blk_rq_pos(rq));
> + put_unaligned_be64(lba, &cmd->cmnd[2]);
> + }
> +
> + rq->completion_data = NULL;
> + rq->timeout = SD_TIMEOUT;
> + rq->__data_len = blk_rq_bytes(rq);
> +
> + /* Don't retry */
> + cmd->allowed = 0;
> + cmd->transfersize = 0;
> + cmd->sc_data_direction = DMA_NONE;
> +}
> +
> +int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *cmd)
> +{
> + struct request *rq = cmd->request;
> + struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
> + sector_t sector = blk_rq_pos(rq);
> + sector_t nr_sects = blk_rq_sectors(rq);
> + struct blk_zone *zone = NULL;
> + int ret = BLKPREP_OK;
> +
> + if (nr_sects) {
> + zone = blk_lookup_zone(rq->q, sector);
> + if (!zone)
> + return BLKPREP_KILL;
> + }
> +
> + if (zone) {
> +
> + blk_lock_zone(zone);
> +
> + /* If the zone is being updated, wait */
> + if (blk_zone_in_update(zone)) {
> + ret = BLKPREP_DEFER;
> + goto out;
> + }
> +
> + if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
> + sd_zbc_debug(sdkp,
> + "Discarding unknown zone %zu\n",
> + zone->start);
> + ret = BLKPREP_KILL;
> + goto out;
> + }
> +
> + /* Nothing to do for conventional sequential zones */
> + if (blk_zone_is_conv(zone)) {
> + ret = BLKPREP_DONE;
> + goto out;
> + }
> +
> + if (!blk_try_write_lock_zone(zone)) {
> + ret = BLKPREP_DEFER;
> + goto out;
> + }
> +
> + /* Nothing to do if the zone is already empty */
> + if (blk_zone_is_empty(zone)) {
> + blk_write_unlock_zone(zone);
> + ret = BLKPREP_DONE;
> + goto out;
> + }
> +
> + if (sector != zone->start ||
> + (nr_sects != zone->len)) {
> + sd_printk(KERN_ERR, sdkp,
> + "Unaligned reset wp request, start %zu/%zu"
> + " len %zu/%zu\n",
> + zone->start, sector, zone->len, nr_sects);
> + blk_write_unlock_zone(zone);
> + ret = BLKPREP_KILL;
> + goto out;
> + }
> +
> + }
> +
> + sd_zbc_setup_action_cmnd(cmd, ZO_RESET_WRITE_POINTER, !zone);
> +
> +out:
> + if (zone) {
> + if (ret == BLKPREP_OK) {
> + /*
> + * Opportunistic update. Will be fixed up
> + * with zone update if the command fails,
> + */
> + zone->wp = zone->start;
> + zone->cond = BLK_ZONE_COND_EMPTY;
> + zone->reset = 0;
> + zone->non_seq = 0;
> + }
> + blk_unlock_zone(zone);
> + }
> +
> + return ret;
> +}
> +
> +int sd_zbc_setup_open_cmnd(struct scsi_cmnd *cmd)
> +{
> + struct request *rq = cmd->request;
> + struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
> + sector_t sector = blk_rq_pos(rq);
> + sector_t nr_sects = blk_rq_sectors(rq);
> + struct blk_zone *zone = NULL;
> + int ret = BLKPREP_OK;
> +
> + if (nr_sects) {
> + zone = blk_lookup_zone(rq->q, sector);
> + if (!zone)
> + return BLKPREP_KILL;
> + }
> +
> + if (zone) {
> +
> + blk_lock_zone(zone);
> +
> + /* If the zone is being updated, wait */
> + if (blk_zone_in_update(zone)) {
> + ret = BLKPREP_DEFER;
> + goto out;
> + }
> +
> + if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
> + sd_zbc_debug(sdkp,
> + "Opening unknown zone %zu\n",
> + zone->start);
> + ret = BLKPREP_KILL;
> + goto out;
> + }
> +
> + /*
> + * Nothing to do for conventional zones,
> + * zones already open or full zones.
> + */
> + if (blk_zone_is_conv(zone) ||
> + blk_zone_is_open(zone) ||
> + blk_zone_is_full(zone)) {
> + ret = BLKPREP_DONE;
> + goto out;
> + }
> +
> + if (sector != zone->start ||
> + (nr_sects != zone->len)) {
> + sd_printk(KERN_ERR, sdkp,
> + "Unaligned open zone request, start %zu/%zu"
> + " len %zu/%zu\n",
> + zone->start, sector, zone->len, nr_sects);
> + ret = BLKPREP_KILL;
> + goto out;
> + }
> +
> + }
> +
> + sd_zbc_setup_action_cmnd(cmd, ZO_OPEN_ZONE, !zone);
> +
> +out:
> + if (zone) {
> + if (ret == BLKPREP_OK)
> + /*
> + * Opportunistic update. Will be fixed up
> + * with zone update if the command fails.
> + */
> + zone->cond = BLK_ZONE_COND_EXP_OPEN;
> + blk_unlock_zone(zone);
> + }
> +
> + return ret;
> +}
> +
> +int sd_zbc_setup_close_cmnd(struct scsi_cmnd *cmd)
> +{
> + struct request *rq = cmd->request;
> + struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
> + sector_t sector = blk_rq_pos(rq);
> + sector_t nr_sects = blk_rq_sectors(rq);
> + struct blk_zone *zone = NULL;
> + int ret = BLKPREP_OK;
> +
> + if (nr_sects) {
> + zone = blk_lookup_zone(rq->q, sector);
> + if (!zone)
> + return BLKPREP_KILL;
> + }
> +
> + if (zone) {
> +
> + blk_lock_zone(zone);
> +
> + /* If the zone is being updated, wait */
> + if (blk_zone_in_update(zone)) {
> + ret = BLKPREP_DEFER;
> + goto out;
> + }
> +
> + if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
> + sd_zbc_debug(sdkp,
> + "Closing unknown zone %zu\n",
> + zone->start);
> + ret = BLKPREP_KILL;
> + goto out;
> + }
> +
> + /*
> + * Nothing to do for conventional zones,
> + * full zones or empty zones.
> + */
> + if (blk_zone_is_conv(zone) ||
> + blk_zone_is_full(zone) ||
> + blk_zone_is_empty(zone)) {
> + ret = BLKPREP_DONE;
> + goto out;
> + }
> +
> + if (sector != zone->start ||
> + (nr_sects != zone->len)) {
> + sd_printk(KERN_ERR, sdkp,
> + "Unaligned close zone request, start %zu/%zu"
> + " len %zu/%zu\n",
> + zone->start, sector, zone->len, nr_sects);
> + ret = BLKPREP_KILL;
> + goto out;
> + }
> +
> + }
> +
> + sd_zbc_setup_action_cmnd(cmd, ZO_CLOSE_ZONE, !zone);
> +
> +out:
> + if (zone) {
> + if (ret == BLKPREP_OK)
> + /*
> + * Opportunistic update. Will be fixed up
> + * with zone update if the command fails.
> + */
> + zone->cond = BLK_ZONE_COND_CLOSED;
> + blk_unlock_zone(zone);
> + }
> +
> + return ret;
> +}
> +
> +int sd_zbc_setup_finish_cmnd(struct scsi_cmnd *cmd)
> +{
> + struct request *rq = cmd->request;
> + struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
> + sector_t sector = blk_rq_pos(rq);
> + sector_t nr_sects = blk_rq_sectors(rq);
> + struct blk_zone *zone = NULL;
> + int ret = BLKPREP_OK;
> +
> + if (nr_sects) {
> + zone = blk_lookup_zone(rq->q, sector);
> + if (!zone)
> + return BLKPREP_KILL;
> + }
> +
> + if (zone) {
> +
> + blk_lock_zone(zone);
> +
> + /* If the zone is being updated, wait */
> + if (blk_zone_in_update(zone)) {
> + ret = BLKPREP_DEFER;
> + goto out;
> + }
> +
> + if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
> + sd_zbc_debug(sdkp,
> + "Finishing unknown zone %zu\n",
> + zone->start);
> + ret = BLKPREP_KILL;
> + goto out;
> + }
> +
> + /* Nothing to do for conventional zones and full zones */
> + if (blk_zone_is_conv(zone) ||
> + blk_zone_is_full(zone)) {
> + ret = BLKPREP_DONE;
> + goto out;
> + }
> +
> + if (sector != zone->start ||
> + (nr_sects != zone->len)) {
> + sd_printk(KERN_ERR, sdkp,
> + "Unaligned finish zone request, start %zu/%zu"
> + " len %zu/%zu\n",
> + zone->start, sector, zone->len, nr_sects);
> + ret = BLKPREP_KILL;
> + goto out;
> + }
> +
> + }
> +
> + sd_zbc_setup_action_cmnd(cmd, ZO_FINISH_ZONE, !zone);
> +
> +out:
> + if (zone) {
> + if (ret == BLKPREP_OK) {
> + /*
> + * Opportunistic update. Will be fixed up
> + * with zone update if the command fails.
> + */
> + zone->cond = BLK_ZONE_COND_FULL;
> + if (blk_zone_is_seq(zone))
> + zone->wp = zone->start + zone->len;
> + }
> + blk_unlock_zone(zone);
> + }
> +
> + return ret;
> +}
> +
Would be nice to have open/close/finish/reset share a little more code.
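To make the suggestion concrete, here is a rough sketch of the kind of
shared helper I have in mind. It is standalone C with stand-in types:
struct zone, enum prep and the NOP() mask are hypothetical names used
only for illustration, not anything in this patch or in the block layer.
The four setup functions differ only in which zone conditions make the
action a no-op, so that part becomes a parameter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for BLKPREP_* and struct blk_zone (hypothetical, illustration only). */
enum prep { PREP_OK, PREP_KILL, PREP_DEFER, PREP_DONE };

enum zcond { ZC_EMPTY, ZC_OPEN, ZC_CLOSED, ZC_FULL };

struct zone {
	uint64_t start, len;
	enum zcond cond;
	bool conv;       /* conventional zone */
	bool unknown;    /* unknown zone type */
	bool in_update;  /* zone report in flight */
};

/* One bit per zone condition for which the action is a no-op. */
#define NOP(c) (1u << (c))

/*
 * Checks common to reset/open/close/finish. Only nop_mask differs
 * between the four callers.
 */
static enum prep zone_action_prep(const struct zone *z, uint64_t sector,
				  uint64_t nr_sects, unsigned int nop_mask)
{
	if (z->in_update)
		return PREP_DEFER;	/* wait for the zone update */
	if (z->unknown)
		return PREP_KILL;
	if (z->conv || (nop_mask & NOP(z->cond)))
		return PREP_DONE;	/* nothing to do */
	if (sector != z->start || nr_sects != z->len)
		return PREP_KILL;	/* unaligned request */
	return PREP_OK;
}
```

With something like this, reset would pass NOP(ZC_EMPTY), open
NOP(ZC_OPEN) | NOP(ZC_FULL), close NOP(ZC_FULL) | NOP(ZC_EMPTY) and
finish NOP(ZC_FULL). The zone write lock that reset takes would stay in
the caller, which changes the order of the lock and the empty check
slightly compared to the patch, so treat that as an assumption to verify.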
> +int sd_zbc_setup_read_write(struct scsi_disk *sdkp, struct request *rq,
> + sector_t sector, unsigned int *num_sectors)
> +{
> + struct blk_zone *zone;
> + unsigned int sectors = *num_sectors;
> + int ret = BLKPREP_OK;
> +
> + zone = blk_lookup_zone(rq->q, sector);
> + if (!zone)
> + /* Let the drive handle the request */
> + return BLKPREP_OK;
> +
> + blk_lock_zone(zone);
> +
> + /* If the zone is being updated, wait */
> + if (blk_zone_in_update(zone)) {
> + ret = BLKPREP_DEFER;
> + goto out;
> + }
> +
> + if (zone->type == BLK_ZONE_TYPE_UNKNOWN) {
> + sd_zbc_debug(sdkp,
> + "Unknown zone %zu\n",
> + zone->start);
> + ret = BLKPREP_KILL;
> + goto out;
> + }
> +
> + /* For offline and read-only zones, let the drive fail the command */
> + if (blk_zone_is_offline(zone) ||
> + blk_zone_is_readonly(zone))
> + goto out;
> +
> + /* Do not allow zone boundaries crossing */
> + if (sector + sectors > zone->start + zone->len) {
> + ret = BLKPREP_KILL;
> + goto out;
> + }
> +
> + /* For conventional zones, no checks */
> + if (blk_zone_is_conv(zone))
> + goto out;
> +
> + if (req_op(rq) == REQ_OP_WRITE ||
> + req_op(rq) == REQ_OP_WRITE_SAME) {
> +
> + /*
> + * Write requests may change the write pointer and
> + * transition the zone condition to full. Changes
> + * are opportunistic here. If the request fails, a
> + * zone update will fix the zone information.
> + */
> + if (blk_zone_is_seq_req(zone)) {
> +
> + /*
> + * Do not issue more than one write at a time per
> + * zone. This solves write ordering problems due to
> + * the unlocking of the request queue in the dispatch
> + * path in the non scsi-mq case. For scsi-mq, this
> + * also avoids potential write reordering when multiple
> + * threads running on different CPUs write to the same
> + * zone (with a synchronized sequential pattern).
> + */
> + if (!blk_try_write_lock_zone(zone)) {
> + ret = BLKPREP_DEFER;
> + goto out;
> + }
> +
> + /* For host-managed drives, writes are allowed */
> + /* only at the write pointer position. */
> + if (zone->wp != sector) {
> + blk_write_unlock_zone(zone);
> + ret = BLKPREP_KILL;
> + goto out;
> + }
> +
> + zone->wp += sectors;
> + if (zone->wp >= zone->start + zone->len) {
> + zone->cond = BLK_ZONE_COND_FULL;
> + zone->wp = zone->start + zone->len;
> + }
> +
> + } else {
> +
> + /* For host-aware drives, writes are allowed */
> + /* anywhere in the zone, but wp can only go */
> + /* forward. */
> + sector_t end_sector = sector + sectors;
> + if (sector == zone->wp &&
> + end_sector >= zone->start + zone->len) {
> + zone->cond = BLK_ZONE_COND_FULL;
> + zone->wp = zone->start + zone->len;
> + } else if (end_sector > zone->wp) {
> + zone->wp = end_sector;
> + }
> +
> + }
> +
> + } else {
> +
If the drive does not have restricted reads, then just goto out here.
Not all HM drives have restricted reads, and no HA drives do.
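Something along these lines, as a standalone sketch. The
restricted_reads flag and the enum names are hypothetical; I am assuming
the flag would be filled in from whatever the drive reports (I believe
ZBC exposes this through the URSWRZ bit of the zoned block device
characteristics VPD page, but please check that against the spec).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcomes of the read check (illustration only). */
enum read_prep { READ_PASS, READ_ZERO_FILL, READ_TRUNCATED };

/*
 * Decide how to handle a read of *nr_sects sectors at sector in a
 * sequential zone whose write pointer is at wp.
 */
static enum read_prep zone_read_prep(bool restricted_reads, uint64_t wp,
				     uint64_t sector, unsigned int *nr_sects)
{
	/* Unrestricted reads: let the drive handle the request as is. */
	if (!restricted_reads)
		return READ_PASS;

	/* Read entirely below the write pointer. */
	if (sector + *nr_sects <= wp)
		return READ_PASS;

	/* Read entirely at or beyond the write pointer: zero the buffer. */
	if (wp <= sector)
		return READ_ZERO_FILL;

	/* Read straddles the write pointer: limit the request size. */
	*nr_sects = (unsigned int)(wp - sector);
	return READ_TRUNCATED;
}
```

The last three cases mirror what the patch already does; the only new
part is the early return at the top.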
> + /* Check read after write pointer */
> + if (sector + sectors <= zone->wp)
> + goto out;
> +
> + if (zone->wp <= sector) {
> + /* Read beyond WP: clear request buffer */
> + struct req_iterator iter;
> + struct bio_vec bvec;
> + unsigned long flags;
> + void *buf;
> + rq_for_each_segment(bvec, rq, iter) {
> + buf = bvec_kmap_irq(&bvec, &flags);
> + memset(buf, 0, bvec.bv_len);
> + flush_dcache_page(bvec.bv_page);
> + bvec_kunmap_irq(buf, &flags);
> + }
> + ret = BLKPREP_DONE;
> + goto out;
> + }
> +
> + /* Read straddle WP position: limit request size */
> + *num_sectors = zone->wp - sector;
> +
> + }
> +
> +out:
> + blk_unlock_zone(zone);
> +
> + return ret;
> +}
> +
> +void sd_zbc_done(struct scsi_cmnd *cmd,
> + struct scsi_sense_hdr *sshdr)
> +{
> + int result = cmd->result;
> + struct request *rq = cmd->request;
> + struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
> + struct request_queue *q = sdkp->disk->queue;
> + sector_t pos = blk_rq_pos(rq);
> + struct blk_zone *zone = NULL;
> + bool write_unlock = false;
> +
> + /*
> + * Get the target zone of commands of interest. Some may
> + * apply to all zones so check the request sectors first.
> + */
> + switch (req_op(rq)) {
> + case REQ_OP_DISCARD:
> + case REQ_OP_WRITE:
> + case REQ_OP_WRITE_SAME:
> + case REQ_OP_ZONE_RESET:
> + write_unlock = true;
> + /* fallthru */
> + case REQ_OP_ZONE_OPEN:
> + case REQ_OP_ZONE_CLOSE:
> + case REQ_OP_ZONE_FINISH:
> + if (blk_rq_sectors(rq))
> + zone = blk_lookup_zone(q, pos);
> + break;
> + }
> +
> + if (zone && write_unlock)
> + blk_write_unlock_zone(zone);
> +
> + if (!result)
> + return;
> +
> + if (sshdr->sense_key == ILLEGAL_REQUEST &&
> + sshdr->asc == 0x21)
> + /*
> + * It is unlikely that retrying requests failed with any
> + * kind of alignment error will result in success. So don't
> + * try. Report the error back to the user quickly so that
> + * corrective actions can be taken after obtaining updated
> + * zone information.
> + */
> + cmd->allowed = 0;
> +
> + /* On error, force an update unless this is a failed report */
> + if (req_op(rq) == REQ_OP_ZONE_REPORT)
> + sd_zbc_clear_zones_updating(sdkp, pos, blk_rq_sectors(rq));
> + else if (zone)
> + sd_zbc_update_zones(sdkp, zone->start, zone->len,
> + GFP_ATOMIC, false);
> +}
> +
> +void sd_zbc_read_zones(struct scsi_disk *sdkp, char *buf)
> +{
> + struct request_queue *q = sdkp->disk->queue;
> + struct blk_zone *zone;
> + sector_t capacity;
> + sector_t sector;
> + bool init = false;
> + u32 rep_len;
> + int ret = 0;
> +
> + if (sdkp->zoned != 1 && sdkp->device->type != TYPE_ZBC)
> + /*
> + * Device managed or normal SCSI disk,
> + * no special handling required
> + */
> + return;
> +
> + /* Do a report zone to get the maximum LBA to check capacity */
> + ret = sd_zbc_report_zones(sdkp, buf, SD_BUF_SIZE,
> + 0, ZBC_ZONE_REPORTING_OPTION_ALL, false);
> + if (ret < 0)
> + return;
> +
> + rep_len = get_unaligned_be32(&buf[0]);
> + if (rep_len < 64) {
> + sd_printk(KERN_WARNING, sdkp,
> + "REPORT ZONES report invalid length %u\n",
> + rep_len);
> + return;
> + }
> +
> + if (sdkp->rc_basis == 0) {
> + /* The max_lba field is the capacity of this device */
> + sector_t lba = get_unaligned_be64(&buf[8]);
> + if (lba + 1 > sdkp->capacity) {
> + if (sdkp->first_scan)
> + sd_printk(KERN_WARNING, sdkp,
> + "Changing capacity from %zu "
> + "to max LBA+1 %zu\n",
> + sdkp->capacity,
> + (sector_t) lba + 1);
> + sdkp->capacity = lba + 1;
> + }
> + }
> +
> + /* Setup the zone work queue */
> + if (! sdkp->zone_work_q) {
> + sdkp->zone_work_q =
> + alloc_ordered_workqueue("zbc_wq_%s", WQ_MEM_RECLAIM,
> + sdkp->disk->disk_name);
> + if (!sdkp->zone_work_q) {
> + sdev_printk(KERN_WARNING, sdkp->device,
> + "Create zoned disk workqueue failed\n");
> + return;
> + }
> + init = true;
> + }
> +
> + /*
> + * Parse what we already got. If all zones are not parsed yet,
> + * kick start an update to get the remaining.
> + */
> + capacity = logical_to_sectors(sdkp->device, sdkp->capacity);
> + ret = zbc_parse_zones(sdkp, buf, SD_BUF_SIZE, &sector);
> + if (ret == 0 && sector < capacity) {
> + sd_zbc_update_zones(sdkp, sector, capacity - sector,
> + GFP_KERNEL, init);
> + drain_workqueue(sdkp->zone_work_q);
> + }
> + if (ret)
> + return;
> +
> + /*
> + * Analyze the zones layout: if all zones are the same size and
> + * the size is a power of 2, chunk the device and map discard to
> + * reset write pointer command. Otherwise, disable discard.
> + */
> + sdkp->zone_sectors = 0;
> + sdkp->nr_zones = 0;
> + sector = 0;
> + while(sector < capacity) {
> +
> + zone = blk_lookup_zone(q, sector);
> + if (!zone) {
> + sdkp->zone_sectors = 0;
> + sdkp->nr_zones = 0;
> + break;
> + }
> +
> + sector += zone->len;
> +
> + if (sdkp->zone_sectors == 0) {
> + sdkp->zone_sectors = zone->len;
> + } else if (sector != capacity &&
> + zone->len != sdkp->zone_sectors) {
> + sdkp->zone_sectors = 0;
> + sdkp->nr_zones = 0;
> + break;
> + }
> +
> + sdkp->nr_zones++;
> +
> + }
> +
> + if (!sdkp->zone_sectors ||
> + !is_power_of_2(sdkp->zone_sectors)) {
> + sd_config_discard(sdkp, SD_LBP_DISABLE);
> + if (sdkp->first_scan)
> + sd_printk(KERN_NOTICE, sdkp,
> + "%u zones (non constant zone size)\n",
> + sdkp->nr_zones);
> + return;
> + }
> +
> + /* Setup discard granularity to the zone size */
> + blk_queue_chunk_sectors(sdkp->disk->queue, sdkp->zone_sectors);
> + sdkp->max_unmap_blocks = sdkp->zone_sectors;
> + sdkp->unmap_alignment = sectors_to_logical(sdkp->device,
> + sdkp->zone_sectors);
> + sdkp->unmap_granularity = sdkp->unmap_alignment;
> + sd_config_discard(sdkp, SD_ZBC_RESET_WP);
> +
> + if (sdkp->first_scan) {
> + if (sdkp->nr_zones * sdkp->zone_sectors == capacity)
> + sd_printk(KERN_NOTICE, sdkp,
> + "%u zones of %zu sectors\n",
> + sdkp->nr_zones,
> + sdkp->zone_sectors);
> + else
> + sd_printk(KERN_NOTICE, sdkp,
> + "%u zones of %zu sectors "
> + "+ 1 runt zone\n",
> + sdkp->nr_zones - 1,
> + sdkp->zone_sectors);
> + }
> +}
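For reference, the layout analysis at the end of sd_zbc_read_zones() can be tried outside the kernel. The following is a minimal userspace sketch, not the patch's code: it assumes the zone lengths are already available as a plain array (the patch walks the queue RB-tree with blk_lookup_zone()), and the names zone_layout and analyze_zone_layout are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type; not part of the patch. */
struct zone_layout {
	uint64_t zone_sectors;	/* common zone size, 0 if the layout is unusable */
	unsigned int nr_zones;	/* number of zones, 0 if the layout is unusable */
	bool runt;		/* true if the last zone has a different size */
};

static bool is_pow2(uint64_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/*
 * Same rule as the loop in sd_zbc_read_zones(): every zone except the
 * last one must have the same length, and that length must be a power
 * of 2. Unlike the patch, this sketch also clears nr_zones when the
 * common size is not a power of 2.
 */
static struct zone_layout analyze_zone_layout(const uint64_t *len, size_t n)
{
	struct zone_layout l = { 0, 0, false };
	size_t i;

	for (i = 0; i < n; i++) {
		if (len[i] == 0)
			return (struct zone_layout){ 0, 0, false };
		if (l.zone_sectors == 0) {
			l.zone_sectors = len[i];
		} else if (len[i] != l.zone_sectors) {
			if (i != n - 1)
				return (struct zone_layout){ 0, 0, false };
			l.runt = true;	/* only the last zone may differ */
		}
		l.nr_zones++;
	}
	if (!is_pow2(l.zone_sectors))
		return (struct zone_layout){ 0, 0, false };
	return l;
}
```

A zero zone_sectors result corresponds to the branch of the patch that disables discard; a non-zero result corresponds to the branch that maps discard to reset write pointer with the zone size as granularity.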
> +
> +void sd_zbc_remove(struct scsi_disk *sdkp)
> +{
> +
> + sd_config_discard(sdkp, SD_LBP_DISABLE);
> +
> + if (sdkp->zone_work_q) {
> + drain_workqueue(sdkp->zone_work_q);
> + destroy_workqueue(sdkp->zone_work_q);
> + sdkp->zone_work_q = NULL;
> + blk_drop_zones(sdkp->disk->queue);
> + }
> +}
> +
> diff --git a/include/scsi/scsi_proto.h b/include/scsi/scsi_proto.h
> index d1defd1..6ba66e0 100644
> --- a/include/scsi/scsi_proto.h
> +++ b/include/scsi/scsi_proto.h
> @@ -299,4 +299,21 @@ struct scsi_lun {
> #define SCSI_ACCESS_STATE_MASK 0x0f
> #define SCSI_ACCESS_STATE_PREFERRED 0x80
>
> +/* Reporting options for REPORT ZONES */
> +enum zbc_zone_reporting_options {
> + ZBC_ZONE_REPORTING_OPTION_ALL = 0,
> + ZBC_ZONE_REPORTING_OPTION_EMPTY,
> + ZBC_ZONE_REPORTING_OPTION_IMPLICIT_OPEN,
> + ZBC_ZONE_REPORTING_OPTION_EXPLICIT_OPEN,
> + ZBC_ZONE_REPORTING_OPTION_CLOSED,
> + ZBC_ZONE_REPORTING_OPTION_FULL,
> + ZBC_ZONE_REPORTING_OPTION_READONLY,
> + ZBC_ZONE_REPORTING_OPTION_OFFLINE,
> + ZBC_ZONE_REPORTING_OPTION_NEED_RESET_WP = 0x10,
> + ZBC_ZONE_REPORTING_OPTION_NON_SEQWRITE,
> + ZBC_ZONE_REPORTING_OPTION_NON_WP = 0x3f,
> +};
> +
> +#define ZBC_REPORT_ZONE_PARTIAL 0x80
> +
Why don't we expose these enums via uapi?
> #endif /* _SCSI_PROTO_H_ */
> --
> 2.7.4
>
--
Shaun Tancheff
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 9/9] blk-zoned: Add ioctl interface for zone operations
@ 2016-09-20 6:02 ` Shaun Tancheff
0 siblings, 0 replies; 36+ messages in thread
From: Shaun Tancheff @ 2016-09-20 6:02 UTC (permalink / raw)
To: Damien Le Moal
Cc: linux-scsi, linux-block, Martin K. Petersen, Jens Axboe, Hannes Reinecke
On Mon, Sep 19, 2016 at 4:27 PM, Damien Le Moal <damien.lemoal@hgst.com> wrote:
> From: Shaun Tancheff <shaun.tancheff@seagate.com>
>
> Adds the new BLKUPDATEZONES, BLKREPORTZONE, BLKRESETZONE,
> BLKOPENZONE, BLKCLOSEZONE and BLKFINISHZONE ioctls.
>
> BLKREPORTZONE implementation uses the device queue zone RB-tree by
> default and no actual command is issued to the device. If the
> application needs access to the untracked zone attributes (non-seq
> flag or reset recommended flag, offline or read-only zone condition,
> etc), BLKUPDATEZONES must be issued first to force an update of the
> cached zone information.
>
> Changelog (Damien):
> * Simplified blkzone descriptor (removed bit-fields and use CPU
> endianness)
> * Changed report ioctl to operate on single zone instead of an
> array of blkzone structures.
I think something with this degree of changes from what
I posted should not include my signed-off-by.
I also really don't like forcing the reply to be a single zone. I
think the user should be able to ask for as many or as few as
they would like.
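
The request for a caller-chosen number of zones can be illustrated with a small sketch. It is a userspace illustration only: blkzone_report_sketch, its fields, alloc_report() and fill_report() are hypothetical names and are not the interface from either posting. Only the start/len/wp descriptor fields are taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Trimmed-down descriptor; field names follow the patch's struct blkzone. */
struct zone_desc {
	uint64_t start;
	uint64_t len;
	uint64_t wp;
};

/* Hypothetical variable-length report: header plus caller-sized array. */
struct blkzone_report_sketch {
	uint64_t start_sector;	/* in: first sector of interest */
	uint32_t nr_zones;	/* in: array capacity, out: zones filled */
	uint32_t reserved;
	struct zone_desc zones[];
};

/* Allocate a report able to hold up to max_zones descriptors. */
static struct blkzone_report_sketch *alloc_report(uint64_t start,
						  uint32_t max_zones)
{
	struct blkzone_report_sketch *r;

	r = calloc(1, sizeof(*r) + (size_t)max_zones * sizeof(r->zones[0]));
	if (r) {
		r->start_sector = start;
		r->nr_zones = max_zones;
	}
	return r;
}

/*
 * Stand-in for the kernel side: copy descriptors from a sorted zone table,
 * beginning with the zone containing start_sector, until the caller's
 * array is full or the table ends. Returns the number of zones reported.
 */
static uint32_t fill_report(struct blkzone_report_sketch *r,
			    const struct zone_desc *table, size_t table_len)
{
	uint32_t cap = r->nr_zones, n = 0;
	size_t i;

	for (i = 0; i < table_len && n < cap; i++) {
		if (table[i].start + table[i].len <= r->start_sector)
			continue;	/* zone ends before the requested sector */
		r->zones[n++] = table[i];
	}
	r->nr_zones = n;
	return n;
}
```

With a capacity of 1 this degenerates to the single-zone behaviour of the posted patch, so the two approaches are not mutually exclusive.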
> Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
> Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
> ---
> block/blk-zoned.c | 115 ++++++++++++++++++++++++++++++++++++++++++
> block/ioctl.c | 8 +++
> include/linux/blkdev.h | 7 +++
> include/uapi/linux/Kbuild | 1 +
> include/uapi/linux/blkzoned.h | 91 +++++++++++++++++++++++++++++++++
> include/uapi/linux/fs.h | 1 +
> 6 files changed, 223 insertions(+)
> create mode 100644 include/uapi/linux/blkzoned.h
>
> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
> index a107940..71205c8 100644
> --- a/block/blk-zoned.c
> +++ b/block/blk-zoned.c
> @@ -12,6 +12,7 @@
> #include <linux/module.h>
> #include <linux/rbtree.h>
> #include <linux/blkdev.h>
> +#include <linux/blkzoned.h>
>
> void blk_init_zones(struct request_queue *q)
> {
> @@ -336,3 +337,117 @@ int blkdev_finish_zone(struct block_device *bdev,
> return blkdev_issue_zone_action(bdev, sector, REQ_OP_ZONE_FINISH,
> gfp_mask);
> }
> +
> +static int blkdev_report_zone_ioctl(struct block_device *bdev,
> + void __user *argp)
> +{
> + struct blk_zone *zone;
> + struct blkzone z;
> +
> + if (copy_from_user(&z, argp, sizeof(struct blkzone)))
> + return -EFAULT;
> +
> + zone = blk_lookup_zone(bdev_get_queue(bdev), z.start);
> + if (!zone)
> + return -EINVAL;
> +
> + memset(&z, 0, sizeof(struct blkzone));
> +
> + blk_lock_zone(zone);
> +
> + blk_wait_for_zone_update(zone);
> +
> + z.len = zone->len;
> + z.start = zone->start;
> + z.wp = zone->wp;
> + z.type = zone->type;
> + z.cond = zone->cond;
> + z.non_seq = zone->non_seq;
> + z.reset = zone->reset;
> +
> + blk_unlock_zone(zone);
> +
> + if (copy_to_user(argp, &z, sizeof(struct blkzone)))
> + return -EFAULT;
> +
> + return 0;
> +}
> +
> +static int blkdev_zone_action_ioctl(struct block_device *bdev,
> + unsigned cmd, void __user *argp)
> +{
> + unsigned int op;
> + u64 sector;
> +
> + if (get_user(sector, (u64 __user *)argp))
> + return -EFAULT;
> +
> + switch (cmd) {
> + case BLKRESETZONE:
> + op = REQ_OP_ZONE_RESET;
> + break;
> + case BLKOPENZONE:
> + op = REQ_OP_ZONE_OPEN;
> + break;
> + case BLKCLOSEZONE:
> + op = REQ_OP_ZONE_CLOSE;
> + break;
> + case BLKFINISHZONE:
> + op = REQ_OP_ZONE_FINISH;
> + break;
> + }
> +
> + return blkdev_issue_zone_action(bdev, sector, op, GFP_KERNEL);
> +}
> +
> +/**
> + * Called from blkdev_ioctl.
> + */
> +int blkdev_zone_ioctl(struct block_device *bdev, fmode_t mode,
> + unsigned cmd, unsigned long arg)
> +{
> + void __user *argp = (void __user *)arg;
> + struct request_queue *q;
> + int ret;
> +
> + if (!argp)
> + return -EINVAL;
> +
> + q = bdev_get_queue(bdev);
> + if (!q)
> + return -ENXIO;
> +
> + if (!blk_queue_zoned(q))
> + return -ENOTTY;
> +
> + if (!capable(CAP_SYS_ADMIN))
> + return -EACCES;
> +
> + switch (cmd) {
> + case BLKREPORTZONE:
> + ret = blkdev_report_zone_ioctl(bdev, argp);
> + break;
> + case BLKUPDATEZONES:
> + if (!(mode & FMODE_WRITE)) {
> + ret = -EBADF;
> + break;
> + }
> + ret = blkdev_update_zones(bdev, GFP_KERNEL);
> + break;
> + case BLKRESETZONE:
> + case BLKOPENZONE:
> + case BLKCLOSEZONE:
> + case BLKFINISHZONE:
> + if (!(mode & FMODE_WRITE)) {
> + ret = -EBADF;
> + break;
> + }
> + ret = blkdev_zone_action_ioctl(bdev, cmd, argp);
> + break;
> + default:
> + ret = -ENOTTY;
> + break;
> + }
> +
> + return ret;
> +}
> diff --git a/block/ioctl.c b/block/ioctl.c
> index ed2397f..f09679a 100644
> --- a/block/ioctl.c
> +++ b/block/ioctl.c
> @@ -3,6 +3,7 @@
> #include <linux/export.h>
> #include <linux/gfp.h>
> #include <linux/blkpg.h>
> +#include <linux/blkzoned.h>
> #include <linux/hdreg.h>
> #include <linux/backing-dev.h>
> #include <linux/fs.h>
> @@ -513,6 +514,13 @@ int blkdev_ioctl(struct block_device *bdev, fmode_t mode, unsigned cmd,
> BLKDEV_DISCARD_SECURE);
> case BLKZEROOUT:
> return blk_ioctl_zeroout(bdev, mode, arg);
> + case BLKUPDATEZONES:
> + case BLKREPORTZONE:
> + case BLKRESETZONE:
> + case BLKOPENZONE:
> + case BLKCLOSEZONE:
> + case BLKFINISHZONE:
> + return blkdev_zone_ioctl(bdev, mode, cmd, arg);
> case HDIO_GETGEO:
> return blkdev_getgeo(bdev, argp);
> case BLKRAGET:
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index a85f95b..0299d41 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -405,9 +405,16 @@ extern int blkdev_reset_zone(struct block_device *, sector_t, gfp_t);
> extern int blkdev_open_zone(struct block_device *, sector_t, gfp_t);
> extern int blkdev_close_zone(struct block_device *, sector_t, gfp_t);
> extern int blkdev_finish_zone(struct block_device *, sector_t, gfp_t);
> +extern int blkdev_zone_ioctl(struct block_device *, fmode_t, unsigned int,
> + unsigned long);
> #else /* CONFIG_BLK_DEV_ZONED */
> static inline void blk_init_zones(struct request_queue *q) { };
> static inline void blk_drop_zones(struct request_queue *q) { };
> +static inline int blkdev_zone_ioctl(struct block_device *bdev, fmode_t mode,
> + unsigned cmd, unsigned long arg)
> +{
> + return -ENOTTY;
> +}
> #endif /* CONFIG_BLK_DEV_ZONED */
>
> struct request_queue {
> diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
> index 185f8ea..a2a7522 100644
> --- a/include/uapi/linux/Kbuild
> +++ b/include/uapi/linux/Kbuild
> @@ -70,6 +70,7 @@ header-y += bfs_fs.h
> header-y += binfmts.h
> header-y += blkpg.h
> header-y += blktrace_api.h
> +header-y += blkzoned.h
> header-y += bpf_common.h
> header-y += bpf.h
> header-y += bpqether.h
> diff --git a/include/uapi/linux/blkzoned.h b/include/uapi/linux/blkzoned.h
> new file mode 100644
> index 0000000..23a2702
> --- /dev/null
> +++ b/include/uapi/linux/blkzoned.h
> @@ -0,0 +1,91 @@
> +/*
> + * Zoned block devices handling.
> + *
> + * Copyright (C) 2015 Seagate Technology PLC
> + *
> + * Written by: Shaun Tancheff <shaun.tancheff@seagate.com>
> + *
> + * Modified by: Damien Le Moal <damien.lemoal@hgst.com>
> + * Copyright (C) 2016 Western Digital
> + *
> + * This file is licensed under the terms of the GNU General Public
> + * License version 2. This program is licensed "as is" without any
> + * warranty of any kind, whether express or implied.
> + */
> +#ifndef _UAPI_BLKZONED_H
> +#define _UAPI_BLKZONED_H
> +
> +#include <linux/types.h>
> +#include <linux/ioctl.h>
> +
> +/*
> + * Zone type.
> + */
> +enum blkzone_type {
> + BLKZONE_TYPE_UNKNOWN,
> + BLKZONE_TYPE_CONVENTIONAL,
> + BLKZONE_TYPE_SEQWRITE_REQ,
> + BLKZONE_TYPE_SEQWRITE_PREF,
> +};
> +
> +/*
> + * Zone condition.
> + */
> +enum blkzone_cond {
> + BLKZONE_COND_NO_WP,
> + BLKZONE_COND_EMPTY,
> + BLKZONE_COND_IMP_OPEN,
> + BLKZONE_COND_EXP_OPEN,
> + BLKZONE_COND_CLOSED,
> + BLKZONE_COND_READONLY = 0xd,
> + BLKZONE_COND_FULL,
> + BLKZONE_COND_OFFLINE,
> +};
> +
> +/*
> + * Zone descriptor for BLKREPORTZONE.
> + * start, len and wp use the regular 512 B sector unit,
> + * regardless of the device logical block size. The overall
> + * structure size is 64 B to match the ZBC/ZAC defined zone descriptor
> + * and allow support for future additional zone information.
> + */
> +struct blkzone {
> + __u64 start; /* Zone start sector */
> + __u64 len; /* Zone length in number of sectors */
> + __u64 wp; /* Zone write pointer position */
> + __u8 type; /* Zone type */
> + __u8 cond; /* Zone condition */
> + __u8 non_seq; /* Non-sequential write resources active */
> + __u8 reset; /* Reset write pointer recommended */
> + __u8 reserved[36];
> +};
> +
> +/*
> + * Zone ioctl's:
> + *
> + * BLKUPDATEZONES : Force update of all zones information
> + * BLKREPORTZONE : Get a zone descriptor. Takes a zone descriptor as
> + * argument. The zone to report is the one
> + * containing the sector initially specified in the
> + * descriptor start field.
> + * BLKRESETZONE : Reset the write pointer of the zone containing the
> + * specified sector, or of all written zones if the
> + * sector is ~0ull.
> + * BLKOPENZONE : Explicitly open the zone containing the
> + * specified sector, or all possible zones if the
> + * sector is ~0ull (the drive determines which zone
> + * to open in this case).
> + * BLKCLOSEZONE : Close the zone containing the specified sector, or
> + * all open zones if the sector is ~0ull.
> + * BLKFINISHZONE : Finish the zone (make it full) containing the
> + * specified sector, or all open and closed zones if
> + * the sector is ~0ull.
> + */
> +#define BLKUPDATEZONES _IO(0x12,130)
> +#define BLKREPORTZONE _IOWR(0x12,131,struct blkzone)
> +#define BLKRESETZONE _IOW(0x12,132,unsigned long long)
> +#define BLKOPENZONE _IOW(0x12,133,unsigned long long)
> +#define BLKCLOSEZONE _IOW(0x12,134,unsigned long long)
> +#define BLKFINISHZONE _IOW(0x12,135,unsigned long long)
> +
> +#endif /* _UAPI_BLKZONED_H */
> diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
> index 3b00f7c..1db6d66 100644
> --- a/include/uapi/linux/fs.h
> +++ b/include/uapi/linux/fs.h
> @@ -222,6 +222,7 @@ struct fsxattr {
> #define BLKSECDISCARD _IO(0x12,125)
> #define BLKROTATIONAL _IO(0x12,126)
> #define BLKZEROOUT _IO(0x12,127)
> +/* A jump here: 130-135 are used for zoned block devices (see uapi/linux/blkzoned.h) */
>
> #define BMAP_IOCTL 1 /* obsolete - kept for compatibility */
> #define FIBMAP _IO(0x00,1) /* bmap access */
> --
> 2.7.4
>
--
Shaun Tancheff
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 9/9] blk-zoned: Add ioctl interface for zone operations
2016-09-19 21:27 ` Damien Le Moal
@ 2016-09-20 6:33 ` kbuild test robot
-1 siblings, 0 replies; 36+ messages in thread
From: kbuild test robot @ 2016-09-20 6:33 UTC (permalink / raw)
To: Damien Le Moal
Cc: kbuild-all, linux-scsi, linux-block, martin.petersen, axboe,
hare, shaun.tancheff, Damien Le Moal
[-- Attachment #1: Type: text/plain, Size: 3551 bytes --]
Hi Shaun,
[auto build test ERROR on linus/master]
[also build test ERROR on v4.8-rc7]
[cannot apply to block/for-next next-20160919]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
[Suggest to use git(>=2.9.0) format-patch --base=<commit> (or --base=auto for convenience) to record what (public, well-known) commit your patch series was built on]
[Check https://git-scm.com/docs/git-format-patch for more information]
url: https://github.com/0day-ci/linux/commits/Damien-Le-Moal/ZBC-Zoned-block-device-support/20160920-062608
config: m32r-allyesconfig (attached as .config)
compiler: m32r-linux-gcc (GCC) 6.2.0
reproduce:
wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=m32r
All errors (new ones prefixed by >>):
block/built-in.o: In function `blkdev_zone_ioctl':
>> (.text+0x394c0): undefined reference to `__get_user_bad'
block/built-in.o: In function `blkdev_zone_ioctl':
(.text+0x394c0): relocation truncated to fit: R_M32R_26_PCREL_RELA against undefined symbol `__get_user_bad'
drivers/built-in.o: In function `nvme_nvm_dev_dma_free':
lightnvm.c:(.text+0x286ae4): undefined reference to `dma_pool_free'
lightnvm.c:(.text+0x286ae4): relocation truncated to fit: R_M32R_26_PCREL_RELA against undefined symbol `dma_pool_free'
drivers/built-in.o: In function `nvme_nvm_dev_dma_alloc':
lightnvm.c:(.text+0x286afc): undefined reference to `dma_pool_alloc'
lightnvm.c:(.text+0x286afc): relocation truncated to fit: R_M32R_26_PCREL_RELA against undefined symbol `dma_pool_alloc'
drivers/built-in.o: In function `nvme_nvm_destroy_dma_pool':
lightnvm.c:(.text+0x286b10): undefined reference to `dma_pool_destroy'
lightnvm.c:(.text+0x286b10): relocation truncated to fit: R_M32R_26_PCREL_RELA against undefined symbol `dma_pool_destroy'
drivers/built-in.o: In function `nvme_nvm_create_dma_pool':
lightnvm.c:(.text+0x286b44): undefined reference to `dma_pool_create'
lightnvm.c:(.text+0x286b44): relocation truncated to fit: R_M32R_26_PCREL_RELA against undefined symbol `dma_pool_create'
sound/built-in.o: In function `snd_pcm_lib_default_mmap':
(.text+0xfcdc): undefined reference to `bad_dma_ops'
sound/built-in.o: In function `snd_pcm_lib_default_mmap':
(.text+0xfce0): undefined reference to `bad_dma_ops'
sound/built-in.o: In function `snd_pcm_lib_default_mmap':
(.text+0xfd30): undefined reference to `dma_common_mmap'
sound/built-in.o: In function `snd_pcm_lib_default_mmap':
(.text+0xfd30): relocation truncated to fit: R_M32R_26_PCREL_RELA against undefined symbol `dma_common_mmap'
sound/built-in.o: In function `cygnus_pcm_preallocate_dma_buffer':
cygnus-pcm.c:(.text+0x1100dc): undefined reference to `bad_dma_ops'
cygnus-pcm.c:(.text+0x1100e0): undefined reference to `bad_dma_ops'
cygnus-pcm.c:(.text+0x110114): undefined reference to `bad_dma_ops'
sound/built-in.o: In function `cygnus_dma_free_dma_buffers':
cygnus-pcm.c:(.text+0x110214): undefined reference to `bad_dma_ops'
cygnus-pcm.c:(.text+0x11021c): undefined reference to `bad_dma_ops'
sound/built-in.o:cygnus-pcm.c:(.text+0x1102b4): more undefined references to `bad_dma_ops' follow
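
Of the errors above, only the `__get_user_bad' reference is marked as new (>>). That symbol is conventionally left undefined so that a get_user() with an access size the architecture does not handle fails at link time, and blkdev_zone_action_ioctl() in the patch reads a u64 with get_user(), which is most likely what trips here. One possible way around it, sketched below and not taken from the thread, is to fetch the 8 bytes with copy_from_user(), which has no access-size restriction. The block is a userspace mock so that it compiles anywhere: copy_from_user() is stubbed with memcpy(), and zone_action_get_sector() is a hypothetical helper name.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/*
 * Userspace stand-in for the kernel's copy_from_user(): returns the number
 * of bytes that could NOT be copied (0 on success). A NULL source plays
 * the role of a faulting user pointer in this mock.
 */
static unsigned long copy_from_user(void *to, const void *from,
				    unsigned long n)
{
	if (!from)
		return n;
	memcpy(to, from, n);
	return 0;
}

/*
 * Hypothetical helper: fetch the 64-bit sector argument of the zone action
 * ioctls without relying on an 8-byte get_user().
 */
static int zone_action_get_sector(const void *argp, uint64_t *sector)
{
	if (copy_from_user(sector, argp, sizeof(*sector)))
		return -EFAULT;
	return 0;
}
```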
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 36365 bytes --]
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 9/9] blk-zoned: Add ioctl interface for zone operations
@ 2016-09-20 6:33 ` kbuild test robot
0 siblings, 0 replies; 36+ messages in thread
From: kbuild test robot @ 2016-09-20 6:33 UTC (permalink / raw)
Cc: kbuild-all, linux-scsi, linux-block, martin.petersen, axboe,
hare, shaun.tancheff, Damien Le Moal
[-- Attachment #1: Type: text/plain, Size: 3551 bytes --]
Hi Shaun,
[auto build test ERROR on linus/master]
[also build test ERROR on v4.8-rc7]
[cannot apply to block/for-next next-20160919]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
[Suggest to use git(>=2.9.0) format-patch --base=<commit> (or --base=auto for convenience) to record what (public, well-known) commit your patch series was built on]
[Check https://git-scm.com/docs/git-format-patch for more information]
url: https://github.com/0day-ci/linux/commits/Damien-Le-Moal/ZBC-Zoned-block-device-support/20160920-062608
config: m32r-allyesconfig (attached as .config)
compiler: m32r-linux-gcc (GCC) 6.2.0
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=m32r
All errors (new ones prefixed by >>):
block/built-in.o: In function `blkdev_zone_ioctl':
>> (.text+0x394c0): undefined reference to `__get_user_bad'
block/built-in.o: In function `blkdev_zone_ioctl':
(.text+0x394c0): relocation truncated to fit: R_M32R_26_PCREL_RELA against undefined symbol `__get_user_bad'
drivers/built-in.o: In function `nvme_nvm_dev_dma_free':
lightnvm.c:(.text+0x286ae4): undefined reference to `dma_pool_free'
lightnvm.c:(.text+0x286ae4): relocation truncated to fit: R_M32R_26_PCREL_RELA against undefined symbol `dma_pool_free'
drivers/built-in.o: In function `nvme_nvm_dev_dma_alloc':
lightnvm.c:(.text+0x286afc): undefined reference to `dma_pool_alloc'
lightnvm.c:(.text+0x286afc): relocation truncated to fit: R_M32R_26_PCREL_RELA against undefined symbol `dma_pool_alloc'
drivers/built-in.o: In function `nvme_nvm_destroy_dma_pool':
lightnvm.c:(.text+0x286b10): undefined reference to `dma_pool_destroy'
lightnvm.c:(.text+0x286b10): relocation truncated to fit: R_M32R_26_PCREL_RELA against undefined symbol `dma_pool_destroy'
drivers/built-in.o: In function `nvme_nvm_create_dma_pool':
lightnvm.c:(.text+0x286b44): undefined reference to `dma_pool_create'
lightnvm.c:(.text+0x286b44): relocation truncated to fit: R_M32R_26_PCREL_RELA against undefined symbol `dma_pool_create'
sound/built-in.o: In function `snd_pcm_lib_default_mmap':
(.text+0xfcdc): undefined reference to `bad_dma_ops'
sound/built-in.o: In function `snd_pcm_lib_default_mmap':
(.text+0xfce0): undefined reference to `bad_dma_ops'
sound/built-in.o: In function `snd_pcm_lib_default_mmap':
(.text+0xfd30): undefined reference to `dma_common_mmap'
sound/built-in.o: In function `snd_pcm_lib_default_mmap':
(.text+0xfd30): relocation truncated to fit: R_M32R_26_PCREL_RELA against undefined symbol `dma_common_mmap'
sound/built-in.o: In function `cygnus_pcm_preallocate_dma_buffer':
cygnus-pcm.c:(.text+0x1100dc): undefined reference to `bad_dma_ops'
cygnus-pcm.c:(.text+0x1100e0): undefined reference to `bad_dma_ops'
cygnus-pcm.c:(.text+0x110114): undefined reference to `bad_dma_ops'
sound/built-in.o: In function `cygnus_dma_free_dma_buffers':
cygnus-pcm.c:(.text+0x110214): undefined reference to `bad_dma_ops'
cygnus-pcm.c:(.text+0x11021c): undefined reference to `bad_dma_ops'
sound/built-in.o:cygnus-pcm.c:(.text+0x1102b4): more undefined references to `bad_dma_ops' follow
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 36365 bytes --]
end of thread
Thread overview: 36+ messages
2016-09-19 21:27 [PATCH 0/9] ZBC / Zoned block device support Damien Le Moal
2016-09-19 21:27 ` Damien Le Moal
2016-09-19 21:27 ` [PATCH 1/9] block: Add 'zoned' queue limit Damien Le Moal
2016-09-19 21:27 ` Damien Le Moal
2016-09-20 4:05 ` Bart Van Assche
2016-09-20 4:05 ` Bart Van Assche
2016-09-19 21:27 ` [PATCH 2/9] blk-sysfs: Add 'chunk_sectors' to sysfs attributes Damien Le Moal
2016-09-19 21:27 ` Damien Le Moal
2016-09-19 21:27 ` [PATCH 3/9] block: update chunk_sectors in blk_stack_limits() Damien Le Moal
2016-09-19 21:27 ` Damien Le Moal
2016-09-19 21:27 ` [PATCH 4/9] block: Define zoned block device operations Damien Le Moal
2016-09-19 21:27 ` Damien Le Moal
2016-09-20 4:05 ` Bart Van Assche
2016-09-20 4:05 ` Bart Van Assche
2016-09-19 21:27 ` [PATCH 5/9] block: Implement support for zoned block devices Damien Le Moal
2016-09-19 21:27 ` Damien Le Moal
2016-09-20 4:18 ` Bart Van Assche
2016-09-20 4:18 ` Bart Van Assche
2016-09-19 21:27 ` [PATCH 6/9] block: Add 'BLKPREP_DONE' return value Damien Le Moal
2016-09-19 21:27 ` Damien Le Moal
2016-09-19 21:27 ` [PATCH 7/9] block: Add 'BLK_MQ_RQ_QUEUE_DONE' " Damien Le Moal
2016-09-19 21:27 ` Damien Le Moal
2016-09-19 21:27 ` [PATCH 8/9] sd: Implement support for ZBC devices Damien Le Moal
2016-09-19 21:27 ` Damien Le Moal
2016-09-20 0:08 ` kbuild test robot
2016-09-20 0:08 ` kbuild test robot
2016-09-20 5:40 ` Shaun Tancheff
2016-09-20 5:40 ` Shaun Tancheff
2016-09-19 21:27 ` [PATCH 9/9] blk-zoned: Add ioctl interface for zone operations Damien Le Moal
2016-09-19 21:27 ` Damien Le Moal
2016-09-20 2:39 ` kbuild test robot
2016-09-20 2:39 ` kbuild test robot
2016-09-20 6:02 ` Shaun Tancheff
2016-09-20 6:02 ` Shaun Tancheff
2016-09-20 6:33 ` kbuild test robot
2016-09-20 6:33 ` kbuild test robot