All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V3 00/12] scsi-mq support for ZBC disks
@ 2017-09-15 10:06 Damien Le Moal
  2017-09-15 10:06 ` [PATCH V3 01/12] block: Fix declaration of blk-mq debugfs functions Damien Le Moal
                   ` (11 more replies)
  0 siblings, 12 replies; 48+ messages in thread
From: Damien Le Moal @ 2017-09-15 10:06 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

This series implements support for ZBC disks used through the scsi-mq I/O path.

The current scsi level support of ZBC disks guarantees write request ordering
using a per-zone write lock which prevents issuing simultaneously multiple
write commands to a zone, doing so avoid reordering of sequential writes to
sequential zones. This method is however ineffective when scsi-mq is used with
zoned block devices. This is due to the different execution model of blk-mq
which passes a request to the scsi layer for dispatching after the request has
been removed from the I/O scheduler queue. That is, when the scsi layer tries
to lock the target zone of the request, the request may already be out of
order and zone write locking fails to prevent that.

Various approaches have been tried to solve this problem. All of them had the
serious disadvantage of cluttering blk-mq code with zoned block device specific
conditions and processing. As such extensive changes can only turn into a
maintenance nightmares, a radically different solution is proposed here.

This series proposes implementing scsi-mq support for zoned block devices in
the form of a new "zoned" I/O scheduler based. The zoned I/O scheduler is
based on mq-deadline, with most of the code reused with no modification other
than name changes. Zoned only differs from mq-deadline from the addition of a
per zone write locking mechanism similar to that implemented in sd_zbc.c. The
zoned scheduler zone write locking mechanism is used for the exact same purpose
as the one in the scsi disk driver, that is, to limit writes per zone to at
most one request to avoid reordering. The locking context however changes from
that of scsi-mq and is moved to the dispatch_request method of the scheduler.
Within this context, under a spin lock guaranteeing atomicity against other
dispatch contexts, target zones of write requests can be locked before write
requests removal from the scheduler. In effect, this results in the same
behavior as the legacy scsi path. Sequential write ordering is preserved.

The zoned scheduler is also optimized to not lock conventional zones. To do so,
the new data structure blk_zoned is added to the request queue structure so
that the low level scsi code can pass information such as number of zones and
zone types to the block layer scheduler. This avoids difficulties in accessing
this information from the I/O scheduler initialization method (init_queue()
method) context.

The series patches are as follows:
- The first 2 patches fix exported symbols declaration to allow compiling
  external I/O schedulers as modules. No functional changes are introduced.
- Patch 3 introduces the new blk_zoned data structure added to request queues
- Pathes 4 to 7 reorganize and cleanup the scsi ZBC support
- Path 8 is a bug fix
- Path 9 implements the initialization of the blk_zoned structure for zoned
  block devices
- Patch 10 is an optimization for the legacy scsi path which avoids zone
  locking if a write request targets a conventional zone. 
- Path 11 disables zone write locking for disks accessed through scsi-mq.
- Patch 12 introduces the new zoned blk-mq I/O scheduler.

Comments are as always very much appreciated.

Changes from v2:
* Introduced blk_zoned structure
* Moved I/O scheduler from drivers/scsi to block 

Changes from v1:
* Addressed Bart's comments for the blk-mq patches (declarations files)
* Split (former) patch 4 into multiple patches to facilitate review
* Fixed scsi disk lookup from io scheduler by introducing
  scsi_disk_from_queue()

Damien Le Moal (12):
  block: Fix declaration of blk-mq debugfs functions
  block: Fix declaration of blk-mq scheduler functions
  block: Add zoned block device information to request queue
  scsi: sd_zbc: Move ZBC declarations to scsi_proto.h
  scsi: sd_zbc: Fix comments and indentation
  scsi: sd_zbc: Rearrange code
  scsi: sd_zbc: Use well defined macros
  scsi: sd_zbc: Fix sd_zbc_read_zoned_characteristics()
  scsi: sd_zbc: Initialize device queue zoned structure
  scsi: sd_zbc: Limit zone write locking to sequential zones
  scsi: sd_zbc: Disable zone write locking with scsi-mq
  block: Introduce zoned I/O scheduler

 Documentation/block/zoned-iosched.txt |  48 ++
 block/Kconfig.iosched                 |  12 +
 block/Makefile                        |   1 +
 block/blk-mq-debugfs.h                |  14 +-
 block/blk-mq-sched.h                  |  11 +-
 block/zoned-iosched.c                 | 925 ++++++++++++++++++++++++++++++++++
 drivers/scsi/scsi_lib.c               |   5 +-
 drivers/scsi/sd_zbc.c                 | 267 +++++++---
 include/linux/blk-mq-debugfs.h        |  23 +
 include/linux/blk-mq-sched.h          |  14 +
 include/linux/blkdev.h                |  24 +
 include/scsi/scsi_proto.h             |  45 +-
 12 files changed, 1283 insertions(+), 106 deletions(-)
 create mode 100644 Documentation/block/zoned-iosched.txt
 create mode 100644 block/zoned-iosched.c
 create mode 100644 include/linux/blk-mq-debugfs.h
 create mode 100644 include/linux/blk-mq-sched.h

-- 
2.13.5

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH V3 01/12] block: Fix declaration of blk-mq debugfs functions
  2017-09-15 10:06 [PATCH V3 00/12] scsi-mq support for ZBC disks Damien Le Moal
@ 2017-09-15 10:06 ` Damien Le Moal
  2017-09-15 14:33     ` Bart Van Assche
  2017-09-15 17:45   ` Christoph Hellwig
  2017-09-15 10:06 ` [PATCH V3 02/12] block: Fix declaration of blk-mq scheduler functions Damien Le Moal
                   ` (10 subsequent siblings)
  11 siblings, 2 replies; 48+ messages in thread
From: Damien Le Moal @ 2017-09-15 10:06 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

__blk_mq_debugfs_rq_show() and blk_mq_debugfs_rq_show() are exported
symbols but ar eonly declared in the block internal file
block/blk-mq-debugfs.h. which is not cleanly accessible to files outside
of the block directory.
Move the declaration of these functions to the new file
include/linux/blk-mq-debugfs.h file to make the declarations cleanly
available to other modules.

While at it, also move the definition of the blk_mq_debugfs_attr
structure to allow scheduler modules outside of the block directory to
define debugfs attributes.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
---
 block/blk-mq-debugfs.h         | 14 +-------------
 include/linux/blk-mq-debugfs.h | 23 +++++++++++++++++++++++
 2 files changed, 24 insertions(+), 13 deletions(-)
 create mode 100644 include/linux/blk-mq-debugfs.h

diff --git a/block/blk-mq-debugfs.h b/block/blk-mq-debugfs.h
index a182e6f97565..4ffeae84b83a 100644
--- a/block/blk-mq-debugfs.h
+++ b/block/blk-mq-debugfs.h
@@ -3,19 +3,7 @@
 
 #ifdef CONFIG_BLK_DEBUG_FS
 
-#include <linux/seq_file.h>
-
-struct blk_mq_debugfs_attr {
-	const char *name;
-	umode_t mode;
-	int (*show)(void *, struct seq_file *);
-	ssize_t (*write)(void *, const char __user *, size_t, loff_t *);
-	/* Set either .show or .seq_ops. */
-	const struct seq_operations *seq_ops;
-};
-
-int __blk_mq_debugfs_rq_show(struct seq_file *m, struct request *rq);
-int blk_mq_debugfs_rq_show(struct seq_file *m, void *v);
+#include <linux/blk-mq-debugfs.h>
 
 int blk_mq_debugfs_register(struct request_queue *q);
 void blk_mq_debugfs_unregister(struct request_queue *q);
diff --git a/include/linux/blk-mq-debugfs.h b/include/linux/blk-mq-debugfs.h
new file mode 100644
index 000000000000..211537f61dce
--- /dev/null
+++ b/include/linux/blk-mq-debugfs.h
@@ -0,0 +1,23 @@
+#ifndef BLK_MQ_DEBUGFS_H
+#define BLK_MQ_DEBUGFS_H
+
+#ifdef CONFIG_BLK_DEBUG_FS
+
+#include <linux/blkdev.h>
+#include <linux/seq_file.h>
+
+struct blk_mq_debugfs_attr {
+	const char *name;
+	umode_t mode;
+	int (*show)(void *, struct seq_file *);
+	ssize_t (*write)(void *, const char __user *, size_t, loff_t *);
+	/* Set either .show or .seq_ops. */
+	const struct seq_operations *seq_ops;
+};
+
+int __blk_mq_debugfs_rq_show(struct seq_file *m, struct request *rq);
+int blk_mq_debugfs_rq_show(struct seq_file *m, void *v);
+
+#endif
+
+#endif
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH V3 02/12] block: Fix declaration of blk-mq scheduler functions
  2017-09-15 10:06 [PATCH V3 00/12] scsi-mq support for ZBC disks Damien Le Moal
  2017-09-15 10:06 ` [PATCH V3 01/12] block: Fix declaration of blk-mq debugfs functions Damien Le Moal
@ 2017-09-15 10:06 ` Damien Le Moal
  2017-09-15 14:35     ` Bart Van Assche
  2017-09-15 17:46   ` Christoph Hellwig
  2017-09-15 10:06 ` [PATCH V3 03/12] block: Add zoned block device information to request queue Damien Le Moal
                   ` (9 subsequent siblings)
  11 siblings, 2 replies; 48+ messages in thread
From: Damien Le Moal @ 2017-09-15 10:06 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

The functions blk_mq_sched_free_hctx_data(), blk_mq_sched_try_merge(),
blk_mq_sched_try_insert_merge() and blk_mq_sched_request_inserted() are
all exported symbols but are declared only internally in
block/blk-mq-sched.h. Move these declarations to the new file
include/linux/blk-mq-sched.h to make them available to block scheduler
modules implemented outside of the block directory.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
---
 block/blk-mq-sched.h         | 11 +++--------
 include/linux/blk-mq-sched.h | 14 ++++++++++++++
 2 files changed, 17 insertions(+), 8 deletions(-)
 create mode 100644 include/linux/blk-mq-sched.h

diff --git a/block/blk-mq-sched.h b/block/blk-mq-sched.h
index 9267d0b7c197..f48f3c8edcb3 100644
--- a/block/blk-mq-sched.h
+++ b/block/blk-mq-sched.h
@@ -1,19 +1,14 @@
-#ifndef BLK_MQ_SCHED_H
-#define BLK_MQ_SCHED_H
+#ifndef INT_BLK_MQ_SCHED_H
+#define INT_BLK_MQ_SCHED_H
 
 #include "blk-mq.h"
 #include "blk-mq-tag.h"
 
-void blk_mq_sched_free_hctx_data(struct request_queue *q,
-				 void (*exit)(struct blk_mq_hw_ctx *));
+#include <linux/blk-mq-sched.h>
 
 void blk_mq_sched_assign_ioc(struct request *rq, struct bio *bio);
 
-void blk_mq_sched_request_inserted(struct request *rq);
-bool blk_mq_sched_try_merge(struct request_queue *q, struct bio *bio,
-				struct request **merged_request);
 bool __blk_mq_sched_bio_merge(struct request_queue *q, struct bio *bio);
-bool blk_mq_sched_try_insert_merge(struct request_queue *q, struct request *rq);
 void blk_mq_sched_restart(struct blk_mq_hw_ctx *hctx);
 
 void blk_mq_sched_insert_request(struct request *rq, bool at_head,
diff --git a/include/linux/blk-mq-sched.h b/include/linux/blk-mq-sched.h
new file mode 100644
index 000000000000..8ae1992e02c6
--- /dev/null
+++ b/include/linux/blk-mq-sched.h
@@ -0,0 +1,14 @@
+#ifndef BLK_MQ_SCHED_H
+#define BLK_MQ_SCHED_H
+
+/*
+ * Scheduler helper functions.
+ */
+void blk_mq_sched_free_hctx_data(struct request_queue *q,
+				 void (*exit)(struct blk_mq_hw_ctx *));
+void blk_mq_sched_request_inserted(struct request *rq);
+bool blk_mq_sched_try_merge(struct request_queue *q, struct bio *bio,
+			    struct request **merged_request);
+bool blk_mq_sched_try_insert_merge(struct request_queue *q, struct request *rq);
+
+#endif
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH V3 03/12] block: Add zoned block device information to request queue
  2017-09-15 10:06 [PATCH V3 00/12] scsi-mq support for ZBC disks Damien Le Moal
  2017-09-15 10:06 ` [PATCH V3 01/12] block: Fix declaration of blk-mq debugfs functions Damien Le Moal
  2017-09-15 10:06 ` [PATCH V3 02/12] block: Fix declaration of blk-mq scheduler functions Damien Le Moal
@ 2017-09-15 10:06 ` Damien Le Moal
  2017-09-15 14:38     ` Bart Van Assche
  2017-09-15 17:48   ` Christoph Hellwig
  2017-09-15 10:06 ` [PATCH V3 04/12] scsi: sd_zbc: Move ZBC declarations to scsi_proto.h Damien Le Moal
                   ` (8 subsequent siblings)
  11 siblings, 2 replies; 48+ messages in thread
From: Damien Le Moal @ 2017-09-15 10:06 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

Components relying only on the requeuest_queue structure for managing
and controlling block devices (e.g. I/O schedulers) have a limited
view/knowledged of the device being controlled. For instance, the device
capacity cannot be known easily, which for a zoned block device also
result in the inability to know the number of zones of the device.

Define the structure blk_zoned and include it as the zoned field of a
request queue to allow low level drivers to provide more information
on the device.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 include/linux/blkdev.h | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 460294bb0fa5..297e0a2fee31 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -349,6 +349,11 @@ struct queue_limits {
 
 #ifdef CONFIG_BLK_DEV_ZONED
 
+struct blk_zoned {
+	unsigned int	nr_zones;
+	unsigned long	*seq_zones;
+};
+
 struct blk_zone_report_hdr {
 	unsigned int	nr_zones;
 	u8		padding[60];
@@ -492,6 +497,10 @@ struct request_queue {
 	struct blk_integrity integrity;
 #endif	/* CONFIG_BLK_DEV_INTEGRITY */
 
+#ifdef CONFIG_BLK_DEV_ZONED
+	struct blk_zoned	zoned;
+#endif
+
 #ifdef CONFIG_PM
 	struct device		*dev;
 	int			rpm_status;
@@ -785,6 +794,11 @@ static inline unsigned int blk_queue_zone_sectors(struct request_queue *q)
 	return blk_queue_is_zoned(q) ? q->limits.chunk_sectors : 0;
 }
 
+static inline unsigned int blk_queue_nr_zones(struct request_queue *q)
+{
+	return blk_queue_is_zoned(q) ? q->zoned.nr_zones : 0;
+}
+
 static inline bool rq_is_sync(struct request *rq)
 {
 	return op_is_sync(rq->cmd_flags);
@@ -1582,6 +1596,16 @@ static inline unsigned int bdev_zone_sectors(struct block_device *bdev)
 	return 0;
 }
 
+static inline unsigned int bdev_nr_zones(struct block_device *bdev)
+{
+	struct request_queue *q = bdev_get_queue(bdev);
+
+	if (q)
+		return blk_queue_nr_zones(q);
+
+	return 0;
+}
+
 static inline int queue_dma_alignment(struct request_queue *q)
 {
 	return q ? q->dma_alignment : 511;
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH V3 04/12] scsi: sd_zbc: Move ZBC declarations to scsi_proto.h
  2017-09-15 10:06 [PATCH V3 00/12] scsi-mq support for ZBC disks Damien Le Moal
                   ` (2 preceding siblings ...)
  2017-09-15 10:06 ` [PATCH V3 03/12] block: Add zoned block device information to request queue Damien Le Moal
@ 2017-09-15 10:06 ` Damien Le Moal
  2017-09-15 17:48   ` Christoph Hellwig
  2017-09-15 10:06 ` [PATCH V3 05/12] scsi: sd_zbc: Fix comments and indentation Damien Le Moal
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 48+ messages in thread
From: Damien Le Moal @ 2017-09-15 10:06 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

Move standard macro definitions for the zone types and zone conditions
to scsi_proto.h together with the definitions related to the
REPORT ZONES command. While at it, define all values in the enums to
be clear.

Also remove unnecessary includes in sd_zbc.c.

No functional change is introduced by this patch.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Bart Van Assche <Bart.VanAssche@wdc.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
---
 drivers/scsi/sd_zbc.c     | 24 ------------------------
 include/scsi/scsi_proto.h | 45 ++++++++++++++++++++++++++++++++++-----------
 2 files changed, 34 insertions(+), 35 deletions(-)

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 8aa54779aac1..692c8cbc7ed8 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -28,32 +28,8 @@
 
 #include <scsi/scsi.h>
 #include <scsi/scsi_cmnd.h>
-#include <scsi/scsi_dbg.h>
-#include <scsi/scsi_device.h>
-#include <scsi/scsi_driver.h>
-#include <scsi/scsi_host.h>
-#include <scsi/scsi_eh.h>
 
 #include "sd.h"
-#include "scsi_priv.h"
-
-enum zbc_zone_type {
-	ZBC_ZONE_TYPE_CONV = 0x1,
-	ZBC_ZONE_TYPE_SEQWRITE_REQ,
-	ZBC_ZONE_TYPE_SEQWRITE_PREF,
-	ZBC_ZONE_TYPE_RESERVED,
-};
-
-enum zbc_zone_cond {
-	ZBC_ZONE_COND_NO_WP,
-	ZBC_ZONE_COND_EMPTY,
-	ZBC_ZONE_COND_IMP_OPEN,
-	ZBC_ZONE_COND_EXP_OPEN,
-	ZBC_ZONE_COND_CLOSED,
-	ZBC_ZONE_COND_READONLY = 0xd,
-	ZBC_ZONE_COND_FULL,
-	ZBC_ZONE_COND_OFFLINE,
-};
 
 /**
  * Convert a zone descriptor to a zone struct.
diff --git a/include/scsi/scsi_proto.h b/include/scsi/scsi_proto.h
index 8c285d9a06d8..39130a9c05bf 100644
--- a/include/scsi/scsi_proto.h
+++ b/include/scsi/scsi_proto.h
@@ -301,19 +301,42 @@ struct scsi_lun {
 
 /* Reporting options for REPORT ZONES */
 enum zbc_zone_reporting_options {
-	ZBC_ZONE_REPORTING_OPTION_ALL = 0,
-	ZBC_ZONE_REPORTING_OPTION_EMPTY,
-	ZBC_ZONE_REPORTING_OPTION_IMPLICIT_OPEN,
-	ZBC_ZONE_REPORTING_OPTION_EXPLICIT_OPEN,
-	ZBC_ZONE_REPORTING_OPTION_CLOSED,
-	ZBC_ZONE_REPORTING_OPTION_FULL,
-	ZBC_ZONE_REPORTING_OPTION_READONLY,
-	ZBC_ZONE_REPORTING_OPTION_OFFLINE,
-	ZBC_ZONE_REPORTING_OPTION_NEED_RESET_WP = 0x10,
-	ZBC_ZONE_REPORTING_OPTION_NON_SEQWRITE,
-	ZBC_ZONE_REPORTING_OPTION_NON_WP = 0x3f,
+	ZBC_ZONE_REPORTING_OPTION_ALL		= 0x00,
+	ZBC_ZONE_REPORTING_OPTION_EMPTY		= 0x01,
+	ZBC_ZONE_REPORTING_OPTION_IMPLICIT_OPEN	= 0x02,
+	ZBC_ZONE_REPORTING_OPTION_EXPLICIT_OPEN	= 0x03,
+	ZBC_ZONE_REPORTING_OPTION_CLOSED	= 0x04,
+	ZBC_ZONE_REPORTING_OPTION_FULL		= 0x05,
+	ZBC_ZONE_REPORTING_OPTION_READONLY	= 0x06,
+	ZBC_ZONE_REPORTING_OPTION_OFFLINE	= 0x07,
+	/* 0x08 to 0x0f are reserved */
+	ZBC_ZONE_REPORTING_OPTION_NEED_RESET_WP	= 0x10,
+	ZBC_ZONE_REPORTING_OPTION_NON_SEQWRITE	= 0x11,
+	/* 0x12 to 0x3e are reserved */
+	ZBC_ZONE_REPORTING_OPTION_NON_WP	= 0x3f,
 };
 
 #define ZBC_REPORT_ZONE_PARTIAL 0x80
 
+/* Zone types of REPORT ZONES zone descriptors */
+enum zbc_zone_type {
+	ZBC_ZONE_TYPE_CONV		= 0x1,
+	ZBC_ZONE_TYPE_SEQWRITE_REQ	= 0x2,
+	ZBC_ZONE_TYPE_SEQWRITE_PREF	= 0x3,
+	/* 0x4 to 0xf are reserved */
+};
+
+/* Zone conditions of REPORT ZONES zone descriptors */
+enum zbc_zone_cond {
+	ZBC_ZONE_COND_NO_WP		= 0x0,
+	ZBC_ZONE_COND_EMPTY		= 0x1,
+	ZBC_ZONE_COND_IMP_OPEN		= 0x2,
+	ZBC_ZONE_COND_EXP_OPEN		= 0x3,
+	ZBC_ZONE_COND_CLOSED		= 0x4,
+	/* 0x5 to 0xc are reserved */
+	ZBC_ZONE_COND_READONLY		= 0xd,
+	ZBC_ZONE_COND_FULL		= 0xe,
+	ZBC_ZONE_COND_OFFLINE		= 0xf,
+};
+
 #endif /* _SCSI_PROTO_H_ */
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH V3 05/12] scsi: sd_zbc: Fix comments and indentation
  2017-09-15 10:06 [PATCH V3 00/12] scsi-mq support for ZBC disks Damien Le Moal
                   ` (3 preceding siblings ...)
  2017-09-15 10:06 ` [PATCH V3 04/12] scsi: sd_zbc: Move ZBC declarations to scsi_proto.h Damien Le Moal
@ 2017-09-15 10:06 ` Damien Le Moal
  2017-09-15 10:44   ` Hannes Reinecke
  2017-09-15 10:06 ` [PATCH V3 06/12] scsi: sd_zbc: Rearrange code Damien Le Moal
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 48+ messages in thread
From: Damien Le Moal @ 2017-09-15 10:06 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

Fix comments style (do not use documented comment style) and add some
comments to clarify some functions. Also fix some functions signature
indentation and remove a useless blank line in sd_zbc_read_zones().

No functional change is introduced by this patch.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 drivers/scsi/scsi_lib.c |  5 +++-
 drivers/scsi/sd_zbc.c   | 67 ++++++++++++++++++++++++++++++++++++-------------
 2 files changed, 53 insertions(+), 19 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 9cf6a80fe297..c72b97a74906 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1752,7 +1752,10 @@ static void scsi_done(struct scsi_cmnd *cmd)
  *
  * Returns:     Nothing
  *
- * Lock status: IO request lock assumed to be held when called.
+ * Lock status: request queue lock assumed to be held when called.
+ *
+ * Note: See sd_zbc.c sd_zbc_write_lock_zone() for write order
+ * protection for ZBC disks.
  */
 static void scsi_request_fn(struct request_queue *q)
 	__releases(q->queue_lock)
diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 692c8cbc7ed8..fd29717b6eab 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -31,11 +31,11 @@
 
 #include "sd.h"
 
-/**
- * Convert a zone descriptor to a zone struct.
+/*
+ * Convert a zone descriptor to a struct blk_zone,
+ * taking care of converting LBA sized values to 512B sectors.
  */
-static void sd_zbc_parse_report(struct scsi_disk *sdkp,
-				u8 *buf,
+static void sd_zbc_parse_report(struct scsi_disk *sdkp, u8 *buf,
 				struct blk_zone *zone)
 {
 	struct scsi_device *sdp = sdkp->device;
@@ -57,8 +57,9 @@ static void sd_zbc_parse_report(struct scsi_disk *sdkp,
 		zone->wp = zone->start + zone->len;
 }
 
-/**
+/*
  * Issue a REPORT ZONES scsi command.
+ * For internal use during device validation.
  */
 static int sd_zbc_report_zones(struct scsi_disk *sdkp, unsigned char *buf,
 			       unsigned int buflen, sector_t lba)
@@ -99,6 +100,9 @@ static int sd_zbc_report_zones(struct scsi_disk *sdkp, unsigned char *buf,
 	return 0;
 }
 
+/*
+ * Prepare a REPORT ZONES scsi command for a REQ_OP_ZONE_REPORT request.
+ */
 int sd_zbc_setup_report_cmnd(struct scsi_cmnd *cmd)
 {
 	struct request *rq = cmd->request;
@@ -141,6 +145,11 @@ int sd_zbc_setup_report_cmnd(struct scsi_cmnd *cmd)
 	return BLKPREP_OK;
 }
 
+/*
+ * Process a REPORT ZONES scsi command reply, converting all reported
+ * zone descriptors to struct blk_zone. The conversion is done in-place,
+ * directly in the request specified sg buffer.
+ */
 static void sd_zbc_report_zones_complete(struct scsi_cmnd *scmd,
 					 unsigned int good_bytes)
 {
@@ -196,17 +205,26 @@ static void sd_zbc_report_zones_complete(struct scsi_cmnd *scmd,
 	local_irq_restore(flags);
 }
 
+/*
+ * Zone size in number of 512B sectors.
+ */
 static inline sector_t sd_zbc_zone_sectors(struct scsi_disk *sdkp)
 {
 	return logical_to_sectors(sdkp->device, sdkp->zone_blocks);
 }
 
+/*
+ * Zone number of the specified sector.
+ */
 static inline unsigned int sd_zbc_zone_no(struct scsi_disk *sdkp,
 					  sector_t sector)
 {
 	return sectors_to_logical(sdkp->device, sector) >> sdkp->zone_shift;
 }
 
+/*
+ * Prepare a RESET WRITE POINTER scsi command for a REQ_OP_ZONE_RESET request.
+ */
 int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *cmd)
 {
 	struct request *rq = cmd->request;
@@ -239,6 +257,21 @@ int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *cmd)
 	return BLKPREP_OK;
 }
 
+/*
+ * Write lock a sequential zone. Called from sd_init_cmd() for write requests
+ * (standard write, write same or write zeroes operations).
+ * If the request target zone is not already locked, the zone is locked
+ * and BLKPREP_OK returned, allowing the request to proceed through dispatch in
+ * scsi_request_fn(). Otherwise, BLKPREP_DEFER is returned, forcing the request
+ * to wait for the zone to be unlocked, that is, for the previously issued write
+ * request targeting the same zone to complete.
+ *
+ * This is called from blk_peek_request() context with the
+ * queue lock held and before the request is removed from the scheduler.
+ * As a result, multiple contexts executing concurrently scsi_request_fn()
+ * cannot result in write sequence reordering as only a single write request
+ * per zone is allowed to proceed.
+ */
 int sd_zbc_write_lock_zone(struct scsi_cmnd *cmd)
 {
 	struct request *rq = cmd->request;
@@ -261,10 +294,7 @@ int sd_zbc_write_lock_zone(struct scsi_cmnd *cmd)
 	 * Do not issue more than one write at a time per
 	 * zone. This solves write ordering problems due to
 	 * the unlocking of the request queue in the dispatch
-	 * path in the non scsi-mq case. For scsi-mq, this
-	 * also avoids potential write reordering when multiple
-	 * threads running on different CPUs write to the same
-	 * zone (with a synchronized sequential pattern).
+	 * path in the non scsi-mq case.
 	 */
 	if (sdkp->zones_wlock &&
 	    test_and_set_bit(zno, sdkp->zones_wlock))
@@ -276,6 +306,11 @@ int sd_zbc_write_lock_zone(struct scsi_cmnd *cmd)
 	return BLKPREP_OK;
 }
 
+/*
+ * Write unlock a sequential zone. Called from sd_uninit_cmd().
+ * Unlocking the request target zone will allow dispatching the next
+ * write request for the zone.
+ */
 void sd_zbc_write_unlock_zone(struct scsi_cmnd *cmd)
 {
 	struct request *rq = cmd->request;
@@ -290,8 +325,7 @@ void sd_zbc_write_unlock_zone(struct scsi_cmnd *cmd)
 	}
 }
 
-void sd_zbc_complete(struct scsi_cmnd *cmd,
-		     unsigned int good_bytes,
+void sd_zbc_complete(struct scsi_cmnd *cmd, unsigned int good_bytes,
 		     struct scsi_sense_hdr *sshdr)
 {
 	int result = cmd->result;
@@ -335,7 +369,7 @@ void sd_zbc_complete(struct scsi_cmnd *cmd,
 	}
 }
 
-/**
+/*
  * Read zoned block device characteristics (VPD page B6).
  */
 static int sd_zbc_read_zoned_characteristics(struct scsi_disk *sdkp,
@@ -365,11 +399,10 @@ static int sd_zbc_read_zoned_characteristics(struct scsi_disk *sdkp,
 	return 0;
 }
 
-/**
+/*
  * Check reported capacity.
  */
-static int sd_zbc_check_capacity(struct scsi_disk *sdkp,
-				 unsigned char *buf)
+static int sd_zbc_check_capacity(struct scsi_disk *sdkp, unsigned char *buf)
 {
 	sector_t lba;
 	int ret;
@@ -525,8 +558,7 @@ static int sd_zbc_setup(struct scsi_disk *sdkp)
 	return 0;
 }
 
-int sd_zbc_read_zones(struct scsi_disk *sdkp,
-		      unsigned char *buf)
+int sd_zbc_read_zones(struct scsi_disk *sdkp, unsigned char *buf)
 {
 	int ret;
 
@@ -537,7 +569,6 @@ int sd_zbc_read_zones(struct scsi_disk *sdkp,
 		 */
 		return 0;
 
-
 	/* Get zoned block device characteristics */
 	ret = sd_zbc_read_zoned_characteristics(sdkp, buf);
 	if (ret)
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH V3 06/12] scsi: sd_zbc: Rearrange code
  2017-09-15 10:06 [PATCH V3 00/12] scsi-mq support for ZBC disks Damien Le Moal
                   ` (4 preceding siblings ...)
  2017-09-15 10:06 ` [PATCH V3 05/12] scsi: sd_zbc: Fix comments and indentation Damien Le Moal
@ 2017-09-15 10:06 ` Damien Le Moal
  2017-09-15 10:45   ` Hannes Reinecke
                     ` (2 more replies)
  2017-09-15 10:06 ` [PATCH V3 07/12] scsi: sd_zbc: Use well defined macros Damien Le Moal
                   ` (5 subsequent siblings)
  11 siblings, 3 replies; 48+ messages in thread
From: Damien Le Moal @ 2017-09-15 10:06 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

Rearrange sd_zbc_setup() to include use_16_for_rw and use_10_for_rw
assignments and move the calculation of sdkp->zone_shift together
with the assignment of the verified zone_blocks value in
sd_zbc_check_zone_size().

No functional change is introduced by this patch.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 drivers/scsi/sd_zbc.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index fd29717b6eab..081bf5a3b10b 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -532,6 +532,7 @@ static int sd_zbc_check_zone_size(struct scsi_disk *sdkp)
 	}
 
 	sdkp->zone_blocks = zone_blocks;
+	sdkp->zone_shift = ilog2(zone_blocks);
 
 	return 0;
 }
@@ -539,10 +540,13 @@ static int sd_zbc_check_zone_size(struct scsi_disk *sdkp)
 static int sd_zbc_setup(struct scsi_disk *sdkp)
 {
 
+	/* READ16/WRITE16 is mandatory for ZBC disks */
+	sdkp->device->use_16_for_rw = 1;
+	sdkp->device->use_10_for_rw = 0;
+
 	/* chunk_sectors indicates the zone size */
 	blk_queue_chunk_sectors(sdkp->disk->queue,
 			logical_to_sectors(sdkp->device, sdkp->zone_blocks));
-	sdkp->zone_shift = ilog2(sdkp->zone_blocks);
 	sdkp->nr_zones = sdkp->capacity >> sdkp->zone_shift;
 	if (sdkp->capacity & (sdkp->zone_blocks - 1))
 		sdkp->nr_zones++;
@@ -605,10 +609,6 @@ int sd_zbc_read_zones(struct scsi_disk *sdkp, unsigned char *buf)
 	if (ret)
 		goto err;
 
-	/* READ16/WRITE16 is mandatory for ZBC disks */
-	sdkp->device->use_16_for_rw = 1;
-	sdkp->device->use_10_for_rw = 0;
-
 	return 0;
 
 err:
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH V3 07/12] scsi: sd_zbc: Use well defined macros
  2017-09-15 10:06 [PATCH V3 00/12] scsi-mq support for ZBC disks Damien Le Moal
                   ` (5 preceding siblings ...)
  2017-09-15 10:06 ` [PATCH V3 06/12] scsi: sd_zbc: Rearrange code Damien Le Moal
@ 2017-09-15 10:06 ` Damien Le Moal
  2017-09-15 10:45   ` Hannes Reinecke
                     ` (2 more replies)
  2017-09-15 10:06 ` [PATCH V3 08/12] scsi: sd_zbc: Fix sd_zbc_read_zoned_characteristics() Damien Le Moal
                   ` (4 subsequent siblings)
  11 siblings, 3 replies; 48+ messages in thread
From: Damien Le Moal @ 2017-09-15 10:06 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

instead of open coding, use the min() macro to calculate a report zones
reply buffer length in sd_zbc_check_zone_size() and the round_up()
macro for calculating the number of zones in sd_zbc_setup().

No functional change is introduced by this patch.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
---
 drivers/scsi/sd_zbc.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 081bf5a3b10b..37b33adbb2ef 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -430,7 +430,7 @@ static int sd_zbc_check_capacity(struct scsi_disk *sdkp, unsigned char *buf)
 	return 0;
 }
 
-#define SD_ZBC_BUF_SIZE 131072
+#define SD_ZBC_BUF_SIZE 131072U
 
 static int sd_zbc_check_zone_size(struct scsi_disk *sdkp)
 {
@@ -474,10 +474,7 @@ static int sd_zbc_check_zone_size(struct scsi_disk *sdkp)
 		/* Parse REPORT ZONES header */
 		list_length = get_unaligned_be32(&buf[0]) + 64;
 		rec = buf + 64;
-		if (list_length < SD_ZBC_BUF_SIZE)
-			buf_len = list_length;
-		else
-			buf_len = SD_ZBC_BUF_SIZE;
+		buf_len = min(list_length, SD_ZBC_BUF_SIZE);
 
 		/* Parse zone descriptors */
 		while (rec < buf + buf_len) {
@@ -547,9 +544,8 @@ static int sd_zbc_setup(struct scsi_disk *sdkp)
 	/* chunk_sectors indicates the zone size */
 	blk_queue_chunk_sectors(sdkp->disk->queue,
 			logical_to_sectors(sdkp->device, sdkp->zone_blocks));
-	sdkp->nr_zones = sdkp->capacity >> sdkp->zone_shift;
-	if (sdkp->capacity & (sdkp->zone_blocks - 1))
-		sdkp->nr_zones++;
+	sdkp->nr_zones =
+		round_up(sdkp->capacity, sdkp->zone_blocks) >> sdkp->zone_shift;
 
 	if (!sdkp->zones_wlock) {
 		sdkp->zones_wlock = kcalloc(BITS_TO_LONGS(sdkp->nr_zones),
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH V3 08/12] scsi: sd_zbc: Fix sd_zbc_read_zoned_characteristics()
  2017-09-15 10:06 [PATCH V3 00/12] scsi-mq support for ZBC disks Damien Le Moal
                   ` (6 preceding siblings ...)
  2017-09-15 10:06 ` [PATCH V3 07/12] scsi: sd_zbc: Use well defined macros Damien Le Moal
@ 2017-09-15 10:06 ` Damien Le Moal
  2017-09-15 10:46   ` Hannes Reinecke
  2017-09-15 17:52   ` Christoph Hellwig
  2017-09-15 10:06 ` [PATCH V3 09/12] scsi: sd_zbc: Initialize device queue zoned structure Damien Le Moal
                   ` (3 subsequent siblings)
  11 siblings, 2 replies; 48+ messages in thread
From: Damien Le Moal @ 2017-09-15 10:06 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche, stable

The three values starting at byte 8 of the Zoned Block Device
Characteristics VPD page B6h are 32 bits values, not 64bits. So use
get_unaligned_be32() to retrieve the values and not get_unaligned_be64()

Fixes: 89d947561077 ("sd: Implement support for ZBC devices")
Cc: <stable@vger.kernel.org>

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Bart Van Assche <Bart.VanAssche@wdc.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
---
 drivers/scsi/sd_zbc.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 37b33adbb2ef..7e30d0c22930 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -385,15 +385,15 @@ static int sd_zbc_read_zoned_characteristics(struct scsi_disk *sdkp,
 	if (sdkp->device->type != TYPE_ZBC) {
 		/* Host-aware */
 		sdkp->urswrz = 1;
-		sdkp->zones_optimal_open = get_unaligned_be64(&buf[8]);
-		sdkp->zones_optimal_nonseq = get_unaligned_be64(&buf[12]);
+		sdkp->zones_optimal_open = get_unaligned_be32(&buf[8]);
+		sdkp->zones_optimal_nonseq = get_unaligned_be32(&buf[12]);
 		sdkp->zones_max_open = 0;
 	} else {
 		/* Host-managed */
 		sdkp->urswrz = buf[4] & 1;
 		sdkp->zones_optimal_open = 0;
 		sdkp->zones_optimal_nonseq = 0;
-		sdkp->zones_max_open = get_unaligned_be64(&buf[16]);
+		sdkp->zones_max_open = get_unaligned_be32(&buf[16]);
 	}
 
 	return 0;
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH V3 09/12] scsi: sd_zbc: Initialize device queue zoned structure
  2017-09-15 10:06 [PATCH V3 00/12] scsi-mq support for ZBC disks Damien Le Moal
                   ` (7 preceding siblings ...)
  2017-09-15 10:06 ` [PATCH V3 08/12] scsi: sd_zbc: Fix sd_zbc_read_zoned_characteristics() Damien Le Moal
@ 2017-09-15 10:06 ` Damien Le Moal
  2017-09-15 10:49   ` Hannes Reinecke
  2017-09-15 10:06 ` [PATCH V3 10/12] scsi: sd_zbc: Limit zone write locking to sequential zones Damien Le Moal
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 48+ messages in thread
From: Damien Le Moal @ 2017-09-15 10:06 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

Allocate and initialize the disk request queue zoned structure on disk
revalidate. As the bitmap allocation for the seq_zones field of the
zoned structure is identical to the allocation of the zones write lock
bitmap, introduce the helper sd_zbc_alloc_zone_bitmap().
Using this helper, wait for the disk capacity and number of zones to
stabilize on the second revalidation pass to allocate and initialize
the bitmaps.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 drivers/scsi/sd_zbc.c | 119 +++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 114 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 7e30d0c22930..534e747e8bcf 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -534,8 +534,103 @@ static int sd_zbc_check_zone_size(struct scsi_disk *sdkp)
 	return 0;
 }
 
+/*
+ * Allocate a zone bitmap (one bit per zone).
+ */
+static inline unsigned long *sd_zbc_alloc_zone_bitmap(struct scsi_disk *sdkp)
+{
+	struct request_queue *q = sdkp->disk->queue;
+
+	return kzalloc_node(BITS_TO_LONGS(sdkp->nr_zones)
+			    * sizeof(unsigned long),
+			    GFP_KERNEL, q->node);
+}
+
+/*
+ * Initialize the seq_zones bitmap to allow identifying sequential zones.
+ */
+static int sd_zbc_setup_seq_zones(struct scsi_disk *sdkp)
+{
+	struct request_queue *q = sdkp->disk->queue;
+	unsigned long *seq_zones;
+	sector_t block = 0;
+	unsigned char *buf;
+	unsigned char *rec;
+	unsigned int buf_len;
+	unsigned int list_length;
+	unsigned int n = 0;
+	u8 type, cond;
+	int ret = -ENOMEM;
+
+	seq_zones = sd_zbc_alloc_zone_bitmap(sdkp);
+	if (!seq_zones)
+		return -ENOMEM;
+
+	buf = kmalloc(SD_ZBC_BUF_SIZE, GFP_KERNEL);
+	if (!buf)
+		goto out;
+
+	while (block < sdkp->capacity) {
+
+		ret = sd_zbc_report_zones(sdkp, buf, SD_ZBC_BUF_SIZE, block);
+		if (ret)
+			goto out;
+
+		/*
+		 * Parse reported zone descriptors to find sequiential zones.
+		 * Since read-only and offline zones cannot be written, do not
+		 * mark them as sequential in the bitmap.
+		 */
+		list_length = get_unaligned_be32(&buf[0]) + 64;
+		rec = buf + 64;
+		buf_len = min(list_length, SD_ZBC_BUF_SIZE);
+		while (rec < buf + buf_len) {
+			type = rec[0] & 0x0f;
+			cond = (rec[1] >> 4) & 0xf;
+			if (type != ZBC_ZONE_TYPE_CONV &&
+			    cond != ZBC_ZONE_COND_READONLY &&
+			    cond != ZBC_ZONE_COND_OFFLINE)
+				set_bit(n, seq_zones);
+			block = get_unaligned_be64(&rec[8]) +
+				get_unaligned_be64(&rec[16]);
+			rec += 64;
+			n++;
+		}
+
+	}
+
+	if (n != sdkp->nr_zones) {
+		/* Something was wrong */
+		ret = -EIO;
+	}
+
+out:
+	kfree(buf);
+	if (ret) {
+		kfree(seq_zones);
+		return ret;
+	}
+
+	q->zoned.seq_zones = seq_zones;
+
+	return 0;
+}
+
+static void sd_zbc_cleanup(struct scsi_disk *sdkp)
+{
+	struct request_queue *q = sdkp->disk->queue;
+
+	kfree(q->zoned.seq_zones);
+	q->zoned.seq_zones = NULL;
+
+	kfree(sdkp->zones_wlock);
+	sdkp->zones_wlock = NULL;
+}
+
 static int sd_zbc_setup(struct scsi_disk *sdkp)
 {
+	struct request_queue *q = sdkp->disk->queue;
+	int ret;
 
 	/* READ16/WRITE16 is mandatory for ZBC disks */
 	sdkp->device->use_16_for_rw = 1;
@@ -547,14 +642,28 @@ static int sd_zbc_setup(struct scsi_disk *sdkp)
 	sdkp->nr_zones =
 		round_up(sdkp->capacity, sdkp->zone_blocks) >> sdkp->zone_shift;
 
+	/*
+	 * Wait for the disk capacity to stabilize before
+	 * initializing zone related information.
+	 */
+	if (sdkp->first_scan)
+		return 0;
+
 	if (!sdkp->zones_wlock) {
-		sdkp->zones_wlock = kcalloc(BITS_TO_LONGS(sdkp->nr_zones),
-					    sizeof(unsigned long),
-					    GFP_KERNEL);
+		sdkp->zones_wlock = sd_zbc_alloc_zone_bitmap(sdkp);
 		if (!sdkp->zones_wlock)
 			return -ENOMEM;
 	}
 
+	q->zoned.nr_zones = sdkp->nr_zones;
+	if (!q->zoned.seq_zones) {
+		ret = sd_zbc_setup_seq_zones(sdkp);
+		if (ret) {
+			sd_zbc_cleanup(sdkp);
+			return ret;
+		}
+	}
+
 	return 0;
 }
 
@@ -609,14 +718,14 @@ int sd_zbc_read_zones(struct scsi_disk *sdkp, unsigned char *buf)
 
 err:
 	sdkp->capacity = 0;
+	sd_zbc_cleanup(sdkp);
 
 	return ret;
 }
 
 void sd_zbc_remove(struct scsi_disk *sdkp)
 {
-	kfree(sdkp->zones_wlock);
-	sdkp->zones_wlock = NULL;
+	sd_zbc_cleanup(sdkp);
 }
 
 void sd_zbc_print_zones(struct scsi_disk *sdkp)
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH V3 10/12] scsi: sd_zbc: Limit zone write locking to sequential zones
  2017-09-15 10:06 [PATCH V3 00/12] scsi-mq support for ZBC disks Damien Le Moal
                   ` (8 preceding siblings ...)
  2017-09-15 10:06 ` [PATCH V3 09/12] scsi: sd_zbc: Initialize device queue zoned structure Damien Le Moal
@ 2017-09-15 10:06 ` Damien Le Moal
  2017-09-15 10:52   ` Hannes Reinecke
  2017-09-15 16:23     ` Bart Van Assche
  2017-09-15 10:06 ` [PATCH V3 11/12] scsi: sd_zbc: Disable zone write locking with scsi-mq Damien Le Moal
  2017-09-15 10:06 ` [PATCH V3 12/12] block: Introduce zoned I/O scheduler Damien Le Moal
  11 siblings, 2 replies; 48+ messages in thread
From: Damien Le Moal @ 2017-09-15 10:06 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

Zoned block devices have no write constraints for conventional zones.
So write locking of conventional zones is not necessary and can even
hurt performance by unnecessarily operating the disk under low queue
depth. To avoid this, use the disk request queue seq_zones bitmap to
allow any write to be issued to conventional zones, locking only
sequential zones.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 drivers/scsi/sd_zbc.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 534e747e8bcf..c32a3a3d9138 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -275,6 +275,7 @@ int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *cmd)
 int sd_zbc_write_lock_zone(struct scsi_cmnd *cmd)
 {
 	struct request *rq = cmd->request;
+	struct request_queue *q = rq->q;
 	struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
 	sector_t sector = blk_rq_pos(rq);
 	sector_t zone_sectors = sd_zbc_zone_sectors(sdkp);
@@ -286,18 +287,20 @@ int sd_zbc_write_lock_zone(struct scsi_cmnd *cmd)
 	 */
 
 	/* Do not allow zone boundaries crossing on host-managed drives */
-	if (blk_queue_zoned_model(sdkp->disk->queue) == BLK_ZONED_HM &&
+	if (blk_queue_zoned_model(q) == BLK_ZONED_HM &&
 	    (sector & (zone_sectors - 1)) + blk_rq_sectors(rq) > zone_sectors)
 		return BLKPREP_KILL;
 
 	/*
-	 * Do not issue more than one write at a time per
-	 * zone. This solves write ordering problems due to
-	 * the unlocking of the request queue in the dispatch
-	 * path in the non scsi-mq case.
+	 * There is no write constraints on conventional zones. So any write
+	 * command can be sent. But do not issue more than one write command
+	 * at a time per sequential zone. This avoids write ordering problems
+	 * due to the unlocking of the request queue in the dispatch path of
+	 * legacy scsi path, as well as at the HBA level (e.g. AHCI).
 	 */
-	if (sdkp->zones_wlock &&
-	    test_and_set_bit(zno, sdkp->zones_wlock))
+	if (q->zoned.seq_zones && test_bit(zno, q->zoned.seq_zones))
+		return BLKPREP_OK;
+	if (sdkp->zones_wlock && test_and_set_bit(zno, sdkp->zones_wlock))
 		return BLKPREP_DEFER;
 
 	WARN_ON_ONCE(cmd->flags & SCMD_ZONE_WRITE_LOCK);
@@ -316,8 +319,9 @@ void sd_zbc_write_unlock_zone(struct scsi_cmnd *cmd)
 	struct request *rq = cmd->request;
 	struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
 
-	if (sdkp->zones_wlock && cmd->flags & SCMD_ZONE_WRITE_LOCK) {
+	if (cmd->flags & SCMD_ZONE_WRITE_LOCK) {
 		unsigned int zno = sd_zbc_zone_no(sdkp, blk_rq_pos(rq));
+
 		WARN_ON_ONCE(!test_bit(zno, sdkp->zones_wlock));
 		cmd->flags &= ~SCMD_ZONE_WRITE_LOCK;
 		clear_bit_unlock(zno, sdkp->zones_wlock);
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH V3 11/12] scsi: sd_zbc: Disable zone write locking with scsi-mq
  2017-09-15 10:06 [PATCH V3 00/12] scsi-mq support for ZBC disks Damien Le Moal
                   ` (9 preceding siblings ...)
  2017-09-15 10:06 ` [PATCH V3 10/12] scsi: sd_zbc: Limit zone write locking to sequential zones Damien Le Moal
@ 2017-09-15 10:06 ` Damien Le Moal
  2017-09-15 10:52   ` Hannes Reinecke
  2017-09-15 10:06 ` [PATCH V3 12/12] block: Introduce zoned I/O scheduler Damien Le Moal
  11 siblings, 1 reply; 48+ messages in thread
From: Damien Le Moal @ 2017-09-15 10:06 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

In the case of a ZBC disk used with scsi-mq, zone write locking does
not prevent write reordering in sequential zones. Unlike the legacy
case, zone locking is done after the command request is removed from
the scheduler dispatch queue. That is, at the time of zone locking,
the write command may already be out of order, making locking
ineffective. Write order guarantees can only be provided by an
adapted I/O scheduler.

Disable zone write locking in sd_zbc_write_lock_zone() if the disk is
used with scsi-mq. As the disk zones_wlock bitmap is not necessry,
do not allocate it.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Bart Van Assche <Bart.VanAssche@wdc.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
---
 drivers/scsi/sd_zbc.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index c32a3a3d9138..de504026f2c0 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -279,7 +279,7 @@ int sd_zbc_write_lock_zone(struct scsi_cmnd *cmd)
 	struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
 	sector_t sector = blk_rq_pos(rq);
 	sector_t zone_sectors = sd_zbc_zone_sectors(sdkp);
-	unsigned int zno = sd_zbc_zone_no(sdkp, sector);
+	unsigned int zno;
 
 	/*
 	 * Note: Checks of the alignment of the write command on
@@ -291,6 +291,10 @@ int sd_zbc_write_lock_zone(struct scsi_cmnd *cmd)
 	    (sector & (zone_sectors - 1)) + blk_rq_sectors(rq) > zone_sectors)
 		return BLKPREP_KILL;
 
+	/* No write locking with scsi-mq */
+	if (q->mq_ops)
+		return BLKPREP_OK;
+
 	/*
 	 * There is no write constraints on conventional zones. So any write
 	 * command can be sent. But do not issue more than one write command
@@ -298,6 +302,7 @@ int sd_zbc_write_lock_zone(struct scsi_cmnd *cmd)
 	 * due to the unlocking of the request queue in the dispatch path of
 	 * legacy scsi path, as well as at the HBA level (e.g. AHCI).
 	 */
+	zno = sd_zbc_zone_no(sdkp, sector);
 	if (q->zoned.seq_zones && test_bit(zno, q->zoned.seq_zones))
 		return BLKPREP_OK;
 	if (sdkp->zones_wlock && test_and_set_bit(zno, sdkp->zones_wlock))
@@ -653,7 +658,7 @@ static int sd_zbc_setup(struct scsi_disk *sdkp)
 	if (sdkp->first_scan)
 		return 0;
 
-	if (!sdkp->zones_wlock) {
+	if (!q->mq_ops && !sdkp->zones_wlock) {
 		sdkp->zones_wlock = sd_zbc_alloc_zone_bitmap(sdkp);
 		if (!sdkp->zones_wlock)
 			return -ENOMEM;
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH V3 12/12] block: Introduce zoned I/O scheduler
  2017-09-15 10:06 [PATCH V3 00/12] scsi-mq support for ZBC disks Damien Le Moal
                   ` (10 preceding siblings ...)
  2017-09-15 10:06 ` [PATCH V3 11/12] scsi: sd_zbc: Disable zone write locking with scsi-mq Damien Le Moal
@ 2017-09-15 10:06 ` Damien Le Moal
  2017-09-15 10:55   ` Hannes Reinecke
  11 siblings, 1 reply; 48+ messages in thread
From: Damien Le Moal @ 2017-09-15 10:06 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

The zoned I/O scheduler is mostly identical to mq-deadline and retains
the same configuration attributes. The main difference is that the
zoned scheduler will ensure that at any time at most one write request
per sequential zone is in flight (has been dispatched to the disk) in
order to protect against potential sequential write reordering
resulting from the concurrent execution of request dispatch by multiple
contexts.

This is achieved similarly to the legacy SCSI I/O path by write locking
zones when a write requests is issued. Subsequent writes to the same
zone have to wait for the completion of the previously issued write
before being in turn dispatched to the disk. This ensures that
sequential writes are processed in the correct order without needing
any modification to the execution model of blk-mq. In addition, this
also protects against write reordering at the HBA level (e.g. AHCI).

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 Documentation/block/zoned-iosched.txt |  48 ++
 block/Kconfig.iosched                 |  12 +
 block/Makefile                        |   1 +
 block/zoned-iosched.c                 | 925 ++++++++++++++++++++++++++++++++++
 4 files changed, 986 insertions(+)
 create mode 100644 Documentation/block/zoned-iosched.txt
 create mode 100644 block/zoned-iosched.c

diff --git a/Documentation/block/zoned-iosched.txt b/Documentation/block/zoned-iosched.txt
new file mode 100644
index 000000000000..b269125bdc61
--- /dev/null
+++ b/Documentation/block/zoned-iosched.txt
@@ -0,0 +1,48 @@
+Zoned I/O scheduler
+===================
+
+The Zoned I/O scheduler solves zoned block devices write ordering problems
+inherent to the absence of a global request queue lock in the blk-mq
+infrastructure. Multiple contexts may try to dispatch simultaneously write
+requests to the same sequential zone of a zoned block device, doing so
+potentially breaking the sequential write order imposed by the device.
+
+The Zoned I/O scheduler is based on the mq-deadline scheduler. It shares the
+same tunables and behaves in a comparable manner. The main difference introduced
+with the zoned scheduler is handling of write batches. Whereas mq-deadline will
+keep dispatching write requests to the device as long as the batching size
+allows and reads are not starved, the zoned scheduler introduces additional
+constraints:
+1) At most only one write request can be issued to a sequential zone of the
+device. This ensures that no reordering of sequential writes to a sequential
+zone can happen once the write request leaves the scheduler internal queue (rb
+tree).
+2) The number of sequential zones that can be simultaneously written is limited
+to the device advertized maximum number of open zones. This additional condition
+avoids performance degradation due to excessive open zone resource use at the
+device level.
+
+These conditions do not apply to write requests targeting conventional zones.
+For these, the zoned scheduler behaves exactly like the mq-deadline scheduler.
+
+The zoned I/O scheduler cannot be used with regular block devices. It can only
+be used with host-managed or host-aware zoned block devices.
+Using the zoned I/O scheduler is mandatory with host-managed disks unless the
+disk user tightly controls itself write sequencing to sequential zones. The
+zoned scheduler will treat host-aware disks exactly the same way as host-managed
+devices. That is, eventhough host aware disks can be randomly written, the zoned
+scheduler will still impose the limit to one write per zone so that sequential
+writes sequences are preserved.
+
+For host-managed disks, automating the used of the zoned scheduler can be done
+simply with a udev rule. An example of such rule is shown below.
+
+# Set zoned scheduler for host-managed zoned block devices
+ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/zoned}=="host-managed", \
+	ATTR{queue/scheduler}="zoned"
+
+Zoned I/O scheduler tunables
+============================
+
+Tunables of the Zoned I/O scheduler are identical to those of the deadline
+I/O scheduler. See Documentation/block/deadline-iosched.txt for details.
diff --git a/block/Kconfig.iosched b/block/Kconfig.iosched
index fd2cefa47d35..b87c67dbf1f6 100644
--- a/block/Kconfig.iosched
+++ b/block/Kconfig.iosched
@@ -70,6 +70,18 @@ config MQ_IOSCHED_DEADLINE
 	---help---
 	  MQ version of the deadline IO scheduler.
 
+config MQ_IOSCHED_ZONED
+	tristate "Zoned I/O scheduler"
+	depends on BLK_DEV_ZONED
+	depends on SCSI_MOD
+	depends on BLK_DEV_SD
+	default y
+	---help---
+	  MQ deadline IO scheduler with support for zoned block devices.
+
+	  This should be set as the default I/O scheduler for host-managed
+	  zoned block devices. It is optional for host-aware block devices.
+
 config MQ_IOSCHED_KYBER
 	tristate "Kyber I/O scheduler"
 	default y
diff --git a/block/Makefile b/block/Makefile
index 9396ebc85d24..8f35c97a0199 100644
--- a/block/Makefile
+++ b/block/Makefile
@@ -20,6 +20,7 @@ obj-$(CONFIG_IOSCHED_NOOP)	+= noop-iosched.o
 obj-$(CONFIG_IOSCHED_DEADLINE)	+= deadline-iosched.o
 obj-$(CONFIG_IOSCHED_CFQ)	+= cfq-iosched.o
 obj-$(CONFIG_MQ_IOSCHED_DEADLINE)	+= mq-deadline.o
+obj-$(CONFIG_MQ_IOSCHED_ZONED)	+= zoned-iosched.o
 obj-$(CONFIG_MQ_IOSCHED_KYBER)	+= kyber-iosched.o
 bfq-y				:= bfq-iosched.o bfq-wf2q.o bfq-cgroup.o
 obj-$(CONFIG_IOSCHED_BFQ)	+= bfq.o
diff --git a/block/zoned-iosched.c b/block/zoned-iosched.c
new file mode 100644
index 000000000000..0615c200f3f4
--- /dev/null
+++ b/block/zoned-iosched.c
@@ -0,0 +1,925 @@
+/*
+ *  Zoned MQ Deadline i/o scheduler - adaptation of the MQ deadline scheduler,
+ *  for zoned block devices used with the blk-mq scheduling framework
+ *
+ *  Copyright (C) 2016 Jens Axboe <axboe@kernel.dk>
+ *  Copyright (C) 2017 Damien Le Moal <damien.lemoal@wdc.com>
+ */
+#include <linux/kernel.h>
+#include <linux/fs.h>
+#include <linux/blkdev.h>
+#include <linux/blk-mq.h>
+#include <linux/blk-mq-sched.h>
+#include <linux/blk-mq-debugfs.h>
+#include <linux/elevator.h>
+#include <linux/bio.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/init.h>
+#include <linux/compiler.h>
+#include <linux/rbtree.h>
+#include <linux/sbitmap.h>
+#include <linux/seq_file.h>
+
+/*
+ * See Documentation/block/deadline-iosched.txt
+ */
+
+/* max time before a read is submitted. */
+static const int read_expire = HZ / 2;
+
+/* ditto for writes, these limits are SOFT! */
+static const int write_expire = 5 * HZ;
+
+/* max times reads can starve a write */
+static const int writes_starved = 2;
+
+/*
+ * Number of sequential requests treated as one by the above parameters.
+ * For throughput.
+ */
+static const int fifo_batch = 16;
+
+/*
+ * Run time data.
+ */
+struct zoned_data {
+	/*
+	 * requests (zoned_rq s) are present on both sort_list and fifo_list
+	 */
+	struct rb_root sort_list[2];
+	struct list_head fifo_list[2];
+
+	/*
+	 * next in sort order. read, write or both are NULL
+	 */
+	struct request *next_rq[2];
+	unsigned int batching;		/* number of sequential requests made */
+	unsigned int starved;		/* times reads have starved writes */
+
+	/*
+	 * settings that change how the i/o scheduler behaves
+	 */
+	int fifo_expire[2];
+	int fifo_batch;
+	int writes_starved;
+	int front_merges;
+
+	spinlock_t lock;
+	struct list_head dispatch;
+
+	unsigned int zone_sectors_shift;
+	spinlock_t zones_lock;
+	unsigned long *zones_wlock;
+};
+
+static inline struct rb_root *
+zoned_rb_root(struct zoned_data *zd, struct request *rq)
+{
+	return &zd->sort_list[rq_data_dir(rq)];
+}
+
+/*
+ * get the request after `rq' in sector-sorted order
+ */
+static inline struct request *
+zoned_latter_request(struct request *rq)
+{
+	struct rb_node *node = rb_next(&rq->rb_node);
+
+	if (node)
+		return rb_entry_rq(node);
+
+	return NULL;
+}
+
+static void
+zoned_add_rq_rb(struct zoned_data *zd, struct request *rq)
+{
+	struct rb_root *root = zoned_rb_root(zd, rq);
+
+	elv_rb_add(root, rq);
+}
+
+static inline void
+zoned_del_rq_rb(struct zoned_data *zd, struct request *rq)
+{
+	const int data_dir = rq_data_dir(rq);
+
+	if (zd->next_rq[data_dir] == rq)
+		zd->next_rq[data_dir] = zoned_latter_request(rq);
+
+	elv_rb_del(zoned_rb_root(zd, rq), rq);
+}
+
+/*
+ * remove rq from rbtree and fifo.
+ */
+static void zoned_remove_request(struct request_queue *q, struct request *rq)
+{
+	struct zoned_data *zd = q->elevator->elevator_data;
+
+	list_del_init(&rq->queuelist);
+
+	/*
+	 * We might not be on the rbtree, if we are doing an insert merge
+	 */
+	if (!RB_EMPTY_NODE(&rq->rb_node))
+		zoned_del_rq_rb(zd, rq);
+
+	elv_rqhash_del(q, rq);
+	if (q->last_merge == rq)
+		q->last_merge = NULL;
+}
+
+static void zd_request_merged(struct request_queue *q, struct request *req,
+			      enum elv_merge type)
+{
+	struct zoned_data *zd = q->elevator->elevator_data;
+
+	/*
+	 * if the merge was a front merge, we need to reposition request
+	 */
+	if (type == ELEVATOR_FRONT_MERGE) {
+		elv_rb_del(zoned_rb_root(zd, req), req);
+		zoned_add_rq_rb(zd, req);
+	}
+}
+
+static void zd_merged_requests(struct request_queue *q, struct request *req,
+			       struct request *next)
+{
+	/*
+	 * if next expires before rq, assign its expire time to rq
+	 * and move into next position (next will be deleted) in fifo
+	 */
+	if (!list_empty(&req->queuelist) && !list_empty(&next->queuelist)) {
+		if (time_before((unsigned long)next->fifo_time,
+				(unsigned long)req->fifo_time)) {
+			list_move(&req->queuelist, &next->queuelist);
+			req->fifo_time = next->fifo_time;
+		}
+	}
+
+	/*
+	 * kill knowledge of next, this one is a goner
+	 */
+	zoned_remove_request(q, next);
+}
+
+/*
+ * Get a request target zone number.
+ */
+static inline unsigned int zoned_request_zone_no(struct zoned_data *zd,
+						 struct request *rq)
+{
+	return blk_rq_pos(rq) >> zd->zone_sectors_shift;
+}
+
+/*
+ * Return true if a request is a write requests that needs zone
+ * write locking.
+ */
+static inline bool zoned_request_needs_wlock(struct zoned_data *zd,
+					     struct request *rq)
+{
+	unsigned int zno = zoned_request_zone_no(zd, rq);
+	struct request_queue *q = rq->q;
+
+	if (blk_rq_is_passthrough(rq))
+		return false;
+
+	if (q->zoned.seq_zones && !test_bit(zno, q->zoned.seq_zones))
+		return false;
+
+	switch (req_op(rq)) {
+	case REQ_OP_WRITE_ZEROES:
+	case REQ_OP_WRITE_SAME:
+	case REQ_OP_WRITE:
+		return true;
+	default:
+		return false;
+	}
+}
+
+/*
+ * Abuse the elv.priv[0] pointer to indicate if a request
+ * has locked its target zone.
+ */
+#define RQ_LOCKED_ZONE		((void *)1UL)
+static inline void zoned_set_request_lock(struct request *rq)
+{
+	rq->elv.priv[0] = RQ_LOCKED_ZONE;
+}
+
+#define RQ_ZONE_NO_LOCK		((void *)0UL)
+static inline void zoned_clear_request_lock(struct request *rq)
+{
+	rq->elv.priv[0] = RQ_ZONE_NO_LOCK;
+}
+
+static inline bool zoned_request_has_lock(struct request *rq)
+{
+	return rq->elv.priv[0] == RQ_LOCKED_ZONE;
+}
+
+/*
+ * Write lock the target zone of a write request.
+ */
+static void zoned_wlock_request_zone(struct zoned_data *zd, struct request *rq)
+{
+	unsigned int zno = zoned_request_zone_no(zd, rq);
+
+	WARN_ON_ONCE(zoned_request_has_lock(rq));
+	WARN_ON_ONCE(test_and_set_bit(zno, zd->zones_wlock));
+	zoned_set_request_lock(rq);
+}
+
+/*
+ * Write unlock the target zone of a write request.
+ */
+static void zoned_wunlock_request_zone(struct zoned_data *zd,
+				       struct request *rq)
+{
+	unsigned int zno = zoned_request_zone_no(zd, rq);
+	unsigned long flags;
+
+	/*
+	 * Dispatch may be running on a different CPU.
+	 * So do not unlock the zone until it is done or
+	 * a write request in the middle of a sequence may end up
+	 * being dispatched.
+	 */
+	spin_lock_irqsave(&zd->zones_lock, flags);
+
+	WARN_ON_ONCE(!test_and_clear_bit(zno, zd->zones_wlock));
+	zoned_clear_request_lock(rq);
+
+	spin_unlock_irqrestore(&zd->zones_lock, flags);
+}
+
+/*
+ * Test the write lock state of the target zone of a write request.
+ */
+static inline bool zoned_request_zone_is_wlocked(struct zoned_data *zd,
+						 struct request *rq)
+{
+	unsigned int zno = zoned_request_zone_no(zd, rq);
+
+	return test_bit(zno, zd->zones_wlock);
+}
+
+/*
+ * move an entry to dispatch queue
+ */
+static void zoned_move_request(struct zoned_data *zd, struct request *rq)
+{
+	const int data_dir = rq_data_dir(rq);
+
+	zd->next_rq[READ] = NULL;
+	zd->next_rq[WRITE] = NULL;
+	zd->next_rq[data_dir] = zoned_latter_request(rq);
+
+	/*
+	 * take it off the sort and fifo list
+	 */
+	zoned_remove_request(rq->q, rq);
+}
+
+/*
+ * zoned_check_fifo returns 0 if there are no expired requests on the fifo,
+ * 1 otherwise. Requires !list_empty(&zd->fifo_list[data_dir])
+ */
+static inline int zoned_check_fifo(struct zoned_data *zd, int ddir)
+{
+	struct request *rq = rq_entry_fifo(zd->fifo_list[ddir].next);
+
+	/*
+	 * rq is expired!
+	 */
+	if (time_after_eq(jiffies, (unsigned long)rq->fifo_time))
+		return 1;
+
+	return 0;
+}
+
+/*
+ * Test if a request can be dispatched.
+ */
+static inline bool zoned_can_dispatch_request(struct zoned_data *zd,
+					      struct request *rq)
+{
+	return !zoned_request_needs_wlock(zd, rq) ||
+		!zoned_request_zone_is_wlocked(zd, rq);
+}
+
+/*
+ * For the specified data direction, find the next request that can be
+ * dispatched. Search in increasing sector position.
+ */
+static struct request *
+zoned_next_request(struct zoned_data *zd, int data_dir)
+{
+	struct request *rq = zd->next_rq[data_dir];
+	unsigned long flags;
+
+	if (data_dir == READ)
+		return rq;
+
+	spin_lock_irqsave(&zd->zones_lock, flags);
+	while (rq) {
+		if (zoned_can_dispatch_request(zd, rq))
+			break;
+		rq = zoned_latter_request(rq);
+	}
+	spin_unlock_irqrestore(&zd->zones_lock, flags);
+
+	return rq;
+}
+
+/*
+ * For the specified data direction, find the next request that can be
+ * dispatched. Search in arrival order from the oldest request.
+ */
+static struct request *
+zoned_fifo_request(struct zoned_data *zd, int data_dir)
+{
+	struct request *rq;
+	unsigned long flags;
+
+	if (list_empty(&zd->fifo_list[data_dir]))
+		return NULL;
+
+	if (data_dir == READ)
+		return rq_entry_fifo(zd->fifo_list[READ].next);
+
+	spin_lock_irqsave(&zd->zones_lock, flags);
+
+	list_for_each_entry(rq, &zd->fifo_list[WRITE], queuelist) {
+		if (zoned_can_dispatch_request(zd, rq))
+			goto out;
+	}
+	rq = NULL;
+
+out:
+	spin_unlock_irqrestore(&zd->zones_lock, flags);
+
+	return rq;
+}
+
+/*
+ * __zd_dispatch_request selects the best request according to
+ * read/write batch expiration, fifo_batch, target zone lock state, etc
+ */
+static struct request *__zd_dispatch_request(struct blk_mq_hw_ctx *hctx)
+{
+	struct zoned_data *zd = hctx->queue->elevator->elevator_data;
+	struct request *rq = NULL, *next_rq;
+	bool reads, writes;
+	int data_dir;
+
+	if (!list_empty(&zd->dispatch)) {
+		rq = list_first_entry(&zd->dispatch, struct request, queuelist);
+		list_del_init(&rq->queuelist);
+		goto done;
+	}
+
+	reads = !list_empty(&zd->fifo_list[READ]);
+	writes = !list_empty(&zd->fifo_list[WRITE]);
+
+	/*
+	 * batches are currently reads XOR writes
+	 */
+	rq = zoned_next_request(zd, WRITE);
+	if (!rq)
+		rq = zoned_next_request(zd, READ);
+	if (rq && zd->batching < zd->fifo_batch)
+		/* we have a next request are still entitled to batch */
+		goto dispatch_request;
+
+	/*
+	 * at this point we are not running a batch. select the appropriate
+	 * data direction (read / write)
+	 */
+
+	if (reads) {
+		if (writes && (zd->starved++ >= zd->writes_starved))
+			goto dispatch_writes;
+
+		data_dir = READ;
+
+		goto dispatch_find_request;
+	}
+
+	/*
+	 * there are either no reads or writes have been starved
+	 */
+
+	if (writes) {
+dispatch_writes:
+		zd->starved = 0;
+
+		/* Really select writes if at least one can be dispatched */
+		if (zoned_fifo_request(zd, WRITE))
+			data_dir = WRITE;
+		else
+			data_dir = READ;
+
+		goto dispatch_find_request;
+	}
+
+	return NULL;
+
+dispatch_find_request:
+	/*
+	 * we are not running a batch, find best request for selected data_dir
+	 */
+	next_rq = zoned_next_request(zd, data_dir);
+	if (zoned_check_fifo(zd, data_dir) || !next_rq) {
+		/*
+		 * A deadline has expired, the last request was in the other
+		 * direction, or we have run out of higher-sectored requests.
+		 * Start again from the request with the earliest expiry time.
+		 */
+		rq = zoned_fifo_request(zd, data_dir);
+	} else {
+		/*
+		 * The last req was the same dir and we have a next request in
+		 * sort order. No expired requests so continue on from here.
+		 */
+		rq = next_rq;
+	}
+
+	if (!rq)
+		return NULL;
+
+	zd->batching = 0;
+
+dispatch_request:
+	/*
+	 * rq is the selected appropriate request.
+	 */
+	zd->batching++;
+	zoned_move_request(zd, rq);
+
+done:
+	/*
+	 * If the request needs its target zone locked, do it.
+	 */
+	if (zoned_request_needs_wlock(zd, rq))
+		zoned_wlock_request_zone(zd, rq);
+	rq->rq_flags |= RQF_STARTED;
+	return rq;
+}
+
+static struct request *zd_dispatch_request(struct blk_mq_hw_ctx *hctx)
+{
+	struct zoned_data *zd = hctx->queue->elevator->elevator_data;
+	struct request *rq;
+
+	spin_lock(&zd->lock);
+	rq = __zd_dispatch_request(hctx);
+	spin_unlock(&zd->lock);
+
+	return rq;
+}
+
+static int zd_request_merge(struct request_queue *q, struct request **rq,
+			    struct bio *bio)
+{
+	struct zoned_data *zd = q->elevator->elevator_data;
+	sector_t sector = bio_end_sector(bio);
+	struct request *__rq;
+
+	if (!zd->front_merges)
+		return ELEVATOR_NO_MERGE;
+
+	__rq = elv_rb_find(&zd->sort_list[bio_data_dir(bio)], sector);
+	if (__rq) {
+		if (WARN_ON(sector != blk_rq_pos(__rq)))
+			return ELEVATOR_NO_MERGE;
+
+		if (elv_bio_merge_ok(__rq, bio)) {
+			*rq = __rq;
+			return ELEVATOR_FRONT_MERGE;
+		}
+	}
+
+	return ELEVATOR_NO_MERGE;
+}
+
+static bool zd_bio_merge(struct blk_mq_hw_ctx *hctx, struct bio *bio)
+{
+	struct request_queue *q = hctx->queue;
+	struct zoned_data *zd = q->elevator->elevator_data;
+	struct request *free = NULL;
+	bool ret;
+
+	spin_lock(&zd->lock);
+	ret = blk_mq_sched_try_merge(q, bio, &free);
+	spin_unlock(&zd->lock);
+
+	if (free)
+		blk_mq_free_request(free);
+
+	return ret;
+}
+
+/*
+ * add rq to rbtree and fifo
+ */
+static void __zd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
+				bool at_head)
+{
+	struct request_queue *q = hctx->queue;
+	struct zoned_data *zd = q->elevator->elevator_data;
+	const int data_dir = rq_data_dir(rq);
+
+	if (blk_mq_sched_try_insert_merge(q, rq))
+		return;
+
+	blk_mq_sched_request_inserted(rq);
+
+	if (at_head || blk_rq_is_passthrough(rq)) {
+		if (at_head)
+			list_add(&rq->queuelist, &zd->dispatch);
+		else
+			list_add_tail(&rq->queuelist, &zd->dispatch);
+	} else {
+		zoned_add_rq_rb(zd, rq);
+
+		if (rq_mergeable(rq)) {
+			elv_rqhash_add(q, rq);
+			if (!q->last_merge)
+				q->last_merge = rq;
+		}
+
+		/*
+		 * set expire time and add to fifo list
+		 */
+		rq->fifo_time = jiffies + zd->fifo_expire[data_dir];
+		list_add_tail(&rq->queuelist, &zd->fifo_list[data_dir]);
+	}
+}
+
+static void zd_insert_requests(struct blk_mq_hw_ctx *hctx,
+			       struct list_head *list, bool at_head)
+{
+	struct request_queue *q = hctx->queue;
+	struct zoned_data *zd = q->elevator->elevator_data;
+
+	spin_lock(&zd->lock);
+	while (!list_empty(list)) {
+		struct request *rq;
+
+		rq = list_first_entry(list, struct request, queuelist);
+		list_del_init(&rq->queuelist);
+
+		/*
+		 * This may be a requeue of a request that has locked its
+		 * target zone. If this is the case, release the zone lock.
+		 */
+		if (zoned_request_has_lock(rq))
+			zoned_wunlock_request_zone(zd, rq);
+
+		__zd_insert_request(hctx, rq, at_head);
+	}
+	spin_unlock(&zd->lock);
+}
+
+/*
+ * Write unlock the target zone of a completed write request.
+ */
+static void zd_completed_request(struct request *rq)
+{
+
+	if (zoned_request_has_lock(rq)) {
+		struct zoned_data *zd = rq->q->elevator->elevator_data;
+
+		zoned_wunlock_request_zone(zd, rq);
+	}
+}
+
+static bool zd_has_work(struct blk_mq_hw_ctx *hctx)
+{
+	struct zoned_data *zd = hctx->queue->elevator->elevator_data;
+
+	return !list_empty_careful(&zd->dispatch) ||
+		!list_empty_careful(&zd->fifo_list[0]) ||
+		!list_empty_careful(&zd->fifo_list[1]);
+}
+
+/*
+ * Initialize elevator private data (zoned_data).
+ */
+static int zd_init_queue(struct request_queue *q, struct elevator_type *e)
+{
+	struct elevator_queue *eq;
+	struct zoned_data *zd;
+	int ret;
+
+	if (!blk_queue_is_zoned(q)) {
+		pr_err("zoned: Not a zoned block device\n");
+		return -ENODEV;
+	}
+
+	if (!blk_queue_nr_zones(q)) {
+		pr_err("zoned: Invalid zoned block device\n");
+		return -ENODEV;
+	}
+
+	eq = elevator_alloc(q, e);
+	if (!eq)
+		return -ENOMEM;
+
+	zd = kzalloc_node(sizeof(*zd), GFP_KERNEL, q->node);
+	if (!zd) {
+		kobject_put(&eq->kobj);
+		return -ENOMEM;
+	}
+
+	INIT_LIST_HEAD(&zd->fifo_list[READ]);
+	INIT_LIST_HEAD(&zd->fifo_list[WRITE]);
+	zd->sort_list[READ] = RB_ROOT;
+	zd->sort_list[WRITE] = RB_ROOT;
+	zd->fifo_expire[READ] = read_expire;
+	zd->fifo_expire[WRITE] = write_expire;
+	zd->writes_starved = writes_starved;
+	zd->front_merges = 1;
+	zd->fifo_batch = fifo_batch;
+	spin_lock_init(&zd->lock);
+	INIT_LIST_HEAD(&zd->dispatch);
+
+	zd->zone_sectors_shift = ilog2(blk_queue_zone_sectors(q));
+	spin_lock_init(&zd->zones_lock);
+
+	zd->zones_wlock = kzalloc_node(BITS_TO_LONGS(blk_queue_nr_zones(q))
+				       * sizeof(unsigned long),
+				       GFP_KERNEL, q->node);
+	if (!zd->zones_wlock) {
+		ret = -ENOMEM;
+		goto err;
+	}
+
+	eq->elevator_data = zd;
+	q->elevator = eq;
+
+	return 0;
+
+err:
+	kfree(zd);
+	kobject_put(&eq->kobj);
+
+	return ret;
+}
+
+static void zd_exit_queue(struct elevator_queue *e)
+{
+	struct zoned_data *zd = e->elevator_data;
+
+	WARN_ON(!list_empty(&zd->fifo_list[READ]));
+	WARN_ON(!list_empty(&zd->fifo_list[WRITE]));
+
+	kfree(zd);
+}
+
+/*
+ * sysfs parts below
+ */
+static ssize_t
+zoned_var_show(int var, char *page)
+{
+	return sprintf(page, "%d\n", var);
+}
+
+static ssize_t
+zoned_var_store(int *var, const char *page, size_t count)
+{
+	char *p = (char *) page;
+	int ret;
+
+	ret = kstrtoint(p, 10, var);
+	if (ret)
+		return ret;
+
+	return count;
+}
+
+#define SHOW_FUNCTION(__FUNC, __VAR, __CONV)				\
+static ssize_t __FUNC(struct elevator_queue *e, char *page)		\
+{									\
+	struct zoned_data *zd = e->elevator_data;			\
+	int __data = __VAR;						\
+	if (__CONV)							\
+		__data = jiffies_to_msecs(__data);			\
+	return zoned_var_show(__data, (page));			\
+}
+SHOW_FUNCTION(zoned_read_expire_show, zd->fifo_expire[READ], 1);
+SHOW_FUNCTION(zoned_write_expire_show, zd->fifo_expire[WRITE], 1);
+SHOW_FUNCTION(zoned_writes_starved_show, zd->writes_starved, 0);
+SHOW_FUNCTION(zoned_front_merges_show, zd->front_merges, 0);
+SHOW_FUNCTION(zoned_fifo_batch_show, zd->fifo_batch, 0);
+#undef SHOW_FUNCTION
+
+#define STORE_FUNCTION(__FUNC, __PTR, MIN, MAX, __CONV)			\
+static ssize_t __FUNC(struct elevator_queue *e,				\
+		      const char *page, size_t count)			\
+{									\
+	struct zoned_data *zd = e->elevator_data;			\
+	int __data;							\
+	int ret = zoned_var_store(&__data, (page), count);		\
+	if (__data < (MIN))						\
+		__data = (MIN);						\
+	else if (__data > (MAX))					\
+		__data = (MAX);						\
+	if (__CONV)							\
+		*(__PTR) = msecs_to_jiffies(__data);			\
+	else								\
+		*(__PTR) = __data;					\
+	return ret;							\
+}
+STORE_FUNCTION(zoned_read_expire_store, &zd->fifo_expire[READ],
+	       0, INT_MAX, 1);
+STORE_FUNCTION(zoned_write_expire_store, &zd->fifo_expire[WRITE],
+	       0, INT_MAX, 1);
+STORE_FUNCTION(zoned_writes_starved_store, &zd->writes_starved,
+	       INT_MIN, INT_MAX, 0);
+STORE_FUNCTION(zoned_front_merges_store, &zd->front_merges,
+	       0, 1, 0);
+STORE_FUNCTION(zoned_fifo_batch_store, &zd->fifo_batch,
+	       0, INT_MAX, 0);
+#undef STORE_FUNCTION
+
+#define DD_ATTR(name) \
+	__ATTR(name, S_IRUGO|S_IWUSR, zoned_##name##_show, \
+				      zoned_##name##_store)
+
+static struct elv_fs_entry zoned_attrs[] = {
+	DD_ATTR(read_expire),
+	DD_ATTR(write_expire),
+	DD_ATTR(writes_starved),
+	DD_ATTR(front_merges),
+	DD_ATTR(fifo_batch),
+	__ATTR_NULL
+};
+
+#ifdef CONFIG_BLK_DEBUG_FS
+#define ZONED_DEBUGFS_DDIR_ATTRS(ddir, name)				\
+static void *zoned_##name##_fifo_start(struct seq_file *m,		\
+					  loff_t *pos)			\
+	__acquires(&zd->lock)						\
+{									\
+	struct request_queue *q = m->private;				\
+	struct zoned_data *zd = q->elevator->elevator_data;		\
+									\
+	spin_lock(&zd->lock);						\
+	return seq_list_start(&zd->fifo_list[ddir], *pos);		\
+}									\
+									\
+static void *zoned_##name##_fifo_next(struct seq_file *m, void *v,	\
+					 loff_t *pos)			\
+{									\
+	struct request_queue *q = m->private;				\
+	struct zoned_data *zd = q->elevator->elevator_data;		\
+									\
+	return seq_list_next(v, &zd->fifo_list[ddir], pos);		\
+}									\
+									\
+static void zoned_##name##_fifo_stop(struct seq_file *m, void *v)	\
+	__releases(&zd->lock)						\
+{									\
+	struct request_queue *q = m->private;				\
+	struct zoned_data *zd = q->elevator->elevator_data;		\
+									\
+	spin_unlock(&zd->lock);						\
+}									\
+									\
+static const struct seq_operations zoned_##name##_fifo_seq_ops = {	\
+	.start	= zoned_##name##_fifo_start,				\
+	.next	= zoned_##name##_fifo_next,				\
+	.stop	= zoned_##name##_fifo_stop,				\
+	.show	= blk_mq_debugfs_rq_show,				\
+};									\
+									\
+static int zoned_##name##_next_rq_show(void *data,			\
+					  struct seq_file *m)		\
+{									\
+	struct request_queue *q = data;					\
+	struct zoned_data *zd = q->elevator->elevator_data;		\
+	struct request *rq = zd->next_rq[ddir];				\
+									\
+	if (rq)								\
+		__blk_mq_debugfs_rq_show(m, rq);			\
+	return 0;							\
+}
+ZONED_DEBUGFS_DDIR_ATTRS(READ, read)
+ZONED_DEBUGFS_DDIR_ATTRS(WRITE, write)
+#undef ZONED_DEBUGFS_DDIR_ATTRS
+
+static int zoned_batching_show(void *data, struct seq_file *m)
+{
+	struct request_queue *q = data;
+	struct zoned_data *zd = q->elevator->elevator_data;
+
+	seq_printf(m, "%u\n", zd->batching);
+	return 0;
+}
+
+static int zoned_starved_show(void *data, struct seq_file *m)
+{
+	struct request_queue *q = data;
+	struct zoned_data *zd = q->elevator->elevator_data;
+
+	seq_printf(m, "%u\n", zd->starved);
+	return 0;
+}
+
+static void *zoned_dispatch_start(struct seq_file *m, loff_t *pos)
+	__acquires(&zd->lock)
+{
+	struct request_queue *q = m->private;
+	struct zoned_data *zd = q->elevator->elevator_data;
+
+	spin_lock(&zd->lock);
+	return seq_list_start(&zd->dispatch, *pos);
+}
+
+static void *zoned_dispatch_next(struct seq_file *m, void *v, loff_t *pos)
+{
+	struct request_queue *q = m->private;
+	struct zoned_data *zd = q->elevator->elevator_data;
+
+	return seq_list_next(v, &zd->dispatch, pos);
+}
+
+static void zoned_dispatch_stop(struct seq_file *m, void *v)
+	__releases(&zd->lock)
+{
+	struct request_queue *q = m->private;
+	struct zoned_data *zd = q->elevator->elevator_data;
+
+	spin_unlock(&zd->lock);
+}
+
+static const struct seq_operations zoned_dispatch_seq_ops = {
+	.start	= zoned_dispatch_start,
+	.next	= zoned_dispatch_next,
+	.stop	= zoned_dispatch_stop,
+	.show	= blk_mq_debugfs_rq_show,
+};
+
+#define ZONED_QUEUE_DDIR_ATTRS(name)					     \
+	{#name "_fifo_list", 0400, .seq_ops = &zoned_##name##_fifo_seq_ops}, \
+	{#name "_next_rq", 0400, zoned_##name##_next_rq_show}
+static const struct blk_mq_debugfs_attr zoned_queue_debugfs_attrs[] = {
+	ZONED_QUEUE_DDIR_ATTRS(read),
+	ZONED_QUEUE_DDIR_ATTRS(write),
+	{"batching", 0400, zoned_batching_show},
+	{"starved", 0400, zoned_starved_show},
+	{"dispatch", 0400, .seq_ops = &zoned_dispatch_seq_ops},
+	{},
+};
+#undef ZONED_QUEUE_DDIR_ATTRS
+#endif
+
+static struct elevator_type zoned_elv = {
+	.ops.mq = {
+		.insert_requests	= zd_insert_requests,
+		.dispatch_request	= zd_dispatch_request,
+		.completed_request	= zd_completed_request,
+		.next_request		= elv_rb_latter_request,
+		.former_request		= elv_rb_former_request,
+		.bio_merge		= zd_bio_merge,
+		.request_merge		= zd_request_merge,
+		.requests_merged	= zd_merged_requests,
+		.request_merged		= zd_request_merged,
+		.has_work		= zd_has_work,
+		.init_sched		= zd_init_queue,
+		.exit_sched		= zd_exit_queue,
+	},
+
+	.uses_mq	= true,
+#ifdef CONFIG_BLK_DEBUG_FS
+	.queue_debugfs_attrs = zoned_queue_debugfs_attrs,
+#endif
+	.elevator_attrs = zoned_attrs,
+	.elevator_name = "zoned",
+	.elevator_owner = THIS_MODULE,
+};
+
+static int __init zoned_init(void)
+{
+	return elv_register(&zoned_elv);
+}
+
+static void __exit zoned_exit(void)
+{
+	elv_unregister(&zoned_elv);
+}
+
+module_init(zoned_init);
+module_exit(zoned_exit);
+
+MODULE_AUTHOR("Damien Le Moal");
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("Zoned MQ deadline IO scheduler");
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 05/12] scsi: sd_zbc: Fix comments and indentation
  2017-09-15 10:06 ` [PATCH V3 05/12] scsi: sd_zbc: Fix comments and indentation Damien Le Moal
@ 2017-09-15 10:44   ` Hannes Reinecke
  2017-09-15 22:48     ` Damien Le Moal
  0 siblings, 1 reply; 48+ messages in thread
From: Hannes Reinecke @ 2017-09-15 10:44 UTC (permalink / raw)
  To: Damien Le Moal, linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

On 09/15/2017 12:06 PM, Damien Le Moal wrote:
> Fix comments style (do not use documented comment style) and add some
> comments to clarify some functions. Also fix some functions signature
> indentation and remove a useless blank line in sd_zbc_read_zones().
> 
> No functional change is introduced by this patch.
> 
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
> ---
>  drivers/scsi/scsi_lib.c |  5 +++-
>  drivers/scsi/sd_zbc.c   | 67 ++++++++++++++++++++++++++++++++++++-------------
>  2 files changed, 53 insertions(+), 19 deletions(-)
> 
Actually you should've used kernel-doc style to format the comments;
just to keep the 0-day bot happy....

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 06/12] scsi: sd_zbc: Rearrange code
  2017-09-15 10:06 ` [PATCH V3 06/12] scsi: sd_zbc: Rearrange code Damien Le Moal
@ 2017-09-15 10:45   ` Hannes Reinecke
  2017-09-15 14:51     ` Bart Van Assche
  2017-09-15 17:49   ` Christoph Hellwig
  2 siblings, 0 replies; 48+ messages in thread
From: Hannes Reinecke @ 2017-09-15 10:45 UTC (permalink / raw)
  To: Damien Le Moal, linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

On 09/15/2017 12:06 PM, Damien Le Moal wrote:
> Rearrange sd_zbc_setup() to include use_16_for_rw and use_10_for_rw
> assignments and move the calculation of sdkp->zone_shift together
> with the assignment of the verified zone_blocks value in
> sd_zbc_check_zone_size().
> 
> No functional change is introduced by this patch.
> 
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
> ---
>  drivers/scsi/sd_zbc.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 07/12] scsi: sd_zbc: Use well defined macros
  2017-09-15 10:06 ` [PATCH V3 07/12] scsi: sd_zbc: Use well defined macros Damien Le Moal
@ 2017-09-15 10:45   ` Hannes Reinecke
  2017-09-15 16:27     ` Bart Van Assche
  2017-09-15 17:51   ` Christoph Hellwig
  2 siblings, 0 replies; 48+ messages in thread
From: Hannes Reinecke @ 2017-09-15 10:45 UTC (permalink / raw)
  To: Damien Le Moal, linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

On 09/15/2017 12:06 PM, Damien Le Moal wrote:
> instead of open coding, use the min() macro to calculate a report zones
> reply buffer length in sd_zbc_check_zone_size() and the round_up()
> macro for calculating the number of zones in sd_zbc_setup().
> 
> No functional change is introduced by this patch.
> 
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
> ---
>  drivers/scsi/sd_zbc.c | 12 ++++--------
>  1 file changed, 4 insertions(+), 8 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 08/12] scsi: sd_zbc: Fix sd_zbc_read_zoned_characteristics()
  2017-09-15 10:06 ` [PATCH V3 08/12] scsi: sd_zbc: Fix sd_zbc_read_zoned_characteristics() Damien Le Moal
@ 2017-09-15 10:46   ` Hannes Reinecke
  2017-09-15 17:52   ` Christoph Hellwig
  1 sibling, 0 replies; 48+ messages in thread
From: Hannes Reinecke @ 2017-09-15 10:46 UTC (permalink / raw)
  To: Damien Le Moal, linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche, stable

On 09/15/2017 12:06 PM, Damien Le Moal wrote:
> The three values starting at byte 8 of the Zoned Block Device
> Characteristics VPD page B6h are 32 bits values, not 64bits. So use
> get_unaligned_be32() to retrieve the values and not get_unaligned_be64()
> 
> Fixes: 89d947561077 ("sd: Implement support for ZBC devices")
> Cc: <stable@vger.kernel.org>
> 
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
> Reviewed-by: Bart Van Assche <Bart.VanAssche@wdc.com>
> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
> ---
>  drivers/scsi/sd_zbc.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
> index 37b33adbb2ef..7e30d0c22930 100644
> --- a/drivers/scsi/sd_zbc.c
> +++ b/drivers/scsi/sd_zbc.c
> @@ -385,15 +385,15 @@ static int sd_zbc_read_zoned_characteristics(struct scsi_disk *sdkp,
>  	if (sdkp->device->type != TYPE_ZBC) {
>  		/* Host-aware */
>  		sdkp->urswrz = 1;
> -		sdkp->zones_optimal_open = get_unaligned_be64(&buf[8]);
> -		sdkp->zones_optimal_nonseq = get_unaligned_be64(&buf[12]);
> +		sdkp->zones_optimal_open = get_unaligned_be32(&buf[8]);
> +		sdkp->zones_optimal_nonseq = get_unaligned_be32(&buf[12]);
>  		sdkp->zones_max_open = 0;
>  	} else {
>  		/* Host-managed */
>  		sdkp->urswrz = buf[4] & 1;
>  		sdkp->zones_optimal_open = 0;
>  		sdkp->zones_optimal_nonseq = 0;
> -		sdkp->zones_max_open = get_unaligned_be64(&buf[16]);
> +		sdkp->zones_max_open = get_unaligned_be32(&buf[16]);
>  	}
>  
>  	return 0;
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 09/12] scsi: sd_zbc: Initialize device queue zoned structure
  2017-09-15 10:06 ` [PATCH V3 09/12] scsi: sd_zbc: Initialize device queue zoned structure Damien Le Moal
@ 2017-09-15 10:49   ` Hannes Reinecke
  0 siblings, 0 replies; 48+ messages in thread
From: Hannes Reinecke @ 2017-09-15 10:49 UTC (permalink / raw)
  To: Damien Le Moal, linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

On 09/15/2017 12:06 PM, Damien Le Moal wrote:
> Allocate and initialize the disk request queue zoned structure on disk
> revalidate. As the bitmap allocation for the seq_zones field of the
> zoned structure is identical to the allocation of the zones write lock
> bitmap, introduce the helper sd_zbc_alloc_zone_bitmap().
> Using this helper, wait for the disk capacity and number of zones to
> stabilize on the second revalidation pass to allocate and initialize
> the bitmaps.
> 
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
> ---
>  drivers/scsi/sd_zbc.c | 119 +++++++++++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 114 insertions(+), 5 deletions(-)
> 
I would've been slightly worried about locking; these things tend to be
reset during revalidate, which happens quite frequently for ATA drives.
You sure you don't need it here?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 10/12] scsi: sd_zbc: Limit zone write locking to sequential zones
  2017-09-15 10:06 ` [PATCH V3 10/12] scsi: sd_zbc: Limit zone write locking to sequential zones Damien Le Moal
@ 2017-09-15 10:52   ` Hannes Reinecke
  2017-09-15 16:23     ` Bart Van Assche
  1 sibling, 0 replies; 48+ messages in thread
From: Hannes Reinecke @ 2017-09-15 10:52 UTC (permalink / raw)
  To: Damien Le Moal, linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

On 09/15/2017 12:06 PM, Damien Le Moal wrote:
> Zoned block devices have no write constraints for conventional zones.
> So write locking of conventional zones is not necessary and can even
> hurt performance by unnecessarily operating the disk under low queue
> depth. To avoid this, use the disk request queue seq_zones bitmap to
> allow any write to be issued to conventional zones, locking only
> sequential zones.
> 
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
> ---
>  drivers/scsi/sd_zbc.c | 20 ++++++++++++--------
>  1 file changed, 12 insertions(+), 8 deletions(-)
> 
Reviwed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 11/12] scsi: sd_zbc: Disable zone write locking with scsi-mq
  2017-09-15 10:06 ` [PATCH V3 11/12] scsi: sd_zbc: Disable zone write locking with scsi-mq Damien Le Moal
@ 2017-09-15 10:52   ` Hannes Reinecke
  0 siblings, 0 replies; 48+ messages in thread
From: Hannes Reinecke @ 2017-09-15 10:52 UTC (permalink / raw)
  To: Damien Le Moal, linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

On 09/15/2017 12:06 PM, Damien Le Moal wrote:
> In the case of a ZBC disk used with scsi-mq, zone write locking does
> not prevent write reordering in sequential zones. Unlike the legacy
> case, zone locking is done after the command request is removed from
> the scheduler dispatch queue. That is, at the time of zone locking,
> the write command may already be out of order, making locking
> ineffective. Write order guarantees can only be provided by an
> adapted I/O scheduler.
> 
> Disable zone write locking in sd_zbc_write_lock_zone() if the disk is
> used with scsi-mq. As the disk zones_wlock bitmap is not necessry,
> do not allocate it.
> 
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
> Reviewed-by: Bart Van Assche <Bart.VanAssche@wdc.com>
> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
> ---
>  drivers/scsi/sd_zbc.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 12/12] block: Introduce zoned I/O scheduler
  2017-09-15 10:06 ` [PATCH V3 12/12] block: Introduce zoned I/O scheduler Damien Le Moal
@ 2017-09-15 10:55   ` Hannes Reinecke
  0 siblings, 0 replies; 48+ messages in thread
From: Hannes Reinecke @ 2017-09-15 10:55 UTC (permalink / raw)
  To: Damien Le Moal, linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

On 09/15/2017 12:06 PM, Damien Le Moal wrote:
> The zoned I/O scheduler is mostly identical to mq-deadline and retains
> the same configuration attributes. The main difference is that the
> zoned scheduler will ensure that at any time at most one write request
> per sequential zone is in flight (has been dispatched to the disk) in
> order to protect against potential sequential write reordering
> resulting from the concurrent execution of request dispatch by multiple
> contexts.
> 
> This is achieved similarly to the legacy SCSI I/O path by write locking
> zones when a write requests is issued. Subsequent writes to the same
> zone have to wait for the completion of the previously issued write
> before being in turn dispatched to the disk. This ensures that
> sequential writes are processed in the correct order without needing
> any modification to the execution model of blk-mq. In addition, this
> also protects against write reordering at the HBA level (e.g. AHCI).
> 
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
> ---
>  Documentation/block/zoned-iosched.txt |  48 ++
>  block/Kconfig.iosched                 |  12 +
>  block/Makefile                        |   1 +
>  block/zoned-iosched.c                 | 925 ++++++++++++++++++++++++++++++++++
>  4 files changed, 986 insertions(+)
>  create mode 100644 Documentation/block/zoned-iosched.txt
>  create mode 100644 block/zoned-iosched.c
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 01/12] block: Fix declaration of blk-mq debugfs functions
  2017-09-15 10:06 ` [PATCH V3 01/12] block: Fix declaration of blk-mq debugfs functions Damien Le Moal
@ 2017-09-15 14:33     ` Bart Van Assche
  2017-09-15 17:45   ` Christoph Hellwig
  1 sibling, 0 replies; 48+ messages in thread
From: Bart Van Assche @ 2017-09-15 14:33 UTC (permalink / raw)
  To: linux-scsi, linux-block, Damien Le Moal, martin.petersen, axboe; +Cc: hch

T24gRnJpLCAyMDE3LTA5LTE1IGF0IDE5OjA2ICswOTAwLCBEYW1pZW4gTGUgTW9hbCB3cm90ZToN
Cj4gX19ibGtfbXFfZGVidWdmc19ycV9zaG93KCkgYW5kIGJsa19tcV9kZWJ1Z2ZzX3JxX3Nob3co
KSBhcmUgZXhwb3J0ZWQNCj4gc3ltYm9scyBidXQgYXIgZW9ubHkgZGVjbGFyZWQgaW4gdGhlIGJs
b2NrIGludGVybmFsIGZpbGUNCiAgICAgICAgICAgICAgXl5eXl5eXl4NCiAgICAgICAgICAgICAg
YXJlIG9ubHk/DQo+IGJsb2NrL2Jsay1tcS1kZWJ1Z2ZzLmguIHdoaWNoIGlzIG5vdCBjbGVhbmx5
IGFjY2Vzc2libGUgdG8gZmlsZXMgb3V0c2lkZQ0KPiBvZiB0aGUgYmxvY2sgZGlyZWN0b3J5Lg0K
PiBNb3ZlIHRoZSBkZWNsYXJhdGlvbiBvZiB0aGVzZSBmdW5jdGlvbnMgdG8gdGhlIG5ldyBmaWxl
DQo+IGluY2x1ZGUvbGludXgvYmxrLW1xLWRlYnVnZnMuaCBmaWxlIHRvIG1ha2UgdGhlIGRlY2xh
cmF0aW9ucyBjbGVhbmx5DQo+IGF2YWlsYWJsZSB0byBvdGhlciBtb2R1bGVzLg0KPiANCj4gV2hp
bGUgYXQgaXQsIGFsc28gbW92ZSB0aGUgZGVmaW5pdGlvbiBvZiB0aGUgYmxrX21xX2RlYnVnZnNf
YXR0cg0KPiBzdHJ1Y3R1cmUgdG8gYWxsb3cgc2NoZWR1bGVyIG1vZHVsZXMgb3V0c2lkZSBvZiB0
aGUgYmxvY2sgZGlyZWN0b3J5IHRvDQo+IGRlZmluZSBkZWJ1Z2ZzIGF0dHJpYnV0ZXMuDQoNClRo
ZSB0aXRsZSBvZiB0aGlzIHBhdGNoIGlzICJGaXggZGVjbGFyYXRpb24gb2YgLi4uIi4gU2hvdWxk
IHRoZSB0aXRsZSBwZXJoYXBzDQpoYXZlIGJlZW4gIk1vdmUgZGVjbGFyYXRpb25zIG9mIC4uLiI/
DQoNCkFueXdheToNCg0KUmV2aWV3ZWQtYnk6IEJhcnQgVmFuIEFzc2NoZSA8YmFydC52YW5hc3Nj
aGVAd2RjLmNvbT4NCg0K

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 01/12] block: Fix declaration of blk-mq debugfs functions
@ 2017-09-15 14:33     ` Bart Van Assche
  0 siblings, 0 replies; 48+ messages in thread
From: Bart Van Assche @ 2017-09-15 14:33 UTC (permalink / raw)
  To: linux-scsi, linux-block, Damien Le Moal, martin.petersen, axboe; +Cc: hch

On Fri, 2017-09-15 at 19:06 +0900, Damien Le Moal wrote:
> __blk_mq_debugfs_rq_show() and blk_mq_debugfs_rq_show() are exported
> symbols but ar eonly declared in the block internal file
              ^^^^^^^^
              are only?
> block/blk-mq-debugfs.h. which is not cleanly accessible to files outside
> of the block directory.
> Move the declaration of these functions to the new file
> include/linux/blk-mq-debugfs.h file to make the declarations cleanly
> available to other modules.
> 
> While at it, also move the definition of the blk_mq_debugfs_attr
> structure to allow scheduler modules outside of the block directory to
> define debugfs attributes.

The title of this patch is "Fix declaration of ...". Should the title perhaps
have been "Move declarations of ..."?

Anyway:

Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 02/12] block: Fix declaration of blk-mq scheduler functions
  2017-09-15 10:06 ` [PATCH V3 02/12] block: Fix declaration of blk-mq scheduler functions Damien Le Moal
@ 2017-09-15 14:35     ` Bart Van Assche
  2017-09-15 17:46   ` Christoph Hellwig
  1 sibling, 0 replies; 48+ messages in thread
From: Bart Van Assche @ 2017-09-15 14:35 UTC (permalink / raw)
  To: linux-scsi, linux-block, Damien Le Moal, martin.petersen, axboe; +Cc: hch

T24gRnJpLCAyMDE3LTA5LTE1IGF0IDE5OjA2ICswOTAwLCBEYW1pZW4gTGUgTW9hbCB3cm90ZToN
Cj4gVGhlIGZ1bmN0aW9ucyBibGtfbXFfc2NoZWRfZnJlZV9oY3R4X2RhdGEoKSwgYmxrX21xX3Nj
aGVkX3RyeV9tZXJnZSgpLA0KPiBibGtfbXFfc2NoZWRfdHJ5X2luc2VydF9tZXJnZSgpIGFuZCBi
bGtfbXFfc2NoZWRfcmVxdWVzdF9pbnNlcnRlZCgpIGFyZQ0KPiBhbGwgZXhwb3J0ZWQgc3ltYm9s
cyBidXQgYXJlIGRlY2xhcmVkIG9ubHkgaW50ZXJuYWxseSBpbg0KPiBibG9jay9ibGstbXEtc2No
ZWQuaC4gTW92ZSB0aGVzZSBkZWNsYXJhdGlvbnMgdG8gdGhlIG5ldyBmaWxlDQo+IGluY2x1ZGUv
bGludXgvYmxrLW1xLXNjaGVkLmggdG8gbWFrZSB0aGVtIGF2YWlsYWJsZSB0byBibG9jayBzY2hl
ZHVsZXINCj4gbW9kdWxlcyBpbXBsZW1lbnRlZCBvdXRzaWRlIG9mIHRoZSBibG9jayBkaXJlY3Rv
cnkuDQoNClNhbWUgY29tbWVudCBoZXJlOiBzaG91bGQgdGhlIHRpdGxlIG9mIHRoaXMgcGF0Y2gg
cGVyaGFwcyBoYXZlIGJlZW4gIk1vdmUNCmRlY2xhcmF0aW9ucyBvZiAuLi4iPw0KDQo+ICsjaWZu
ZGVmIEJMS19NUV9TQ0hFRF9IDQo+ICsjZGVmaW5lIEJMS19NUV9TQ0hFRF9IDQo+ICsNCj4gKy8q
DQo+ICsgKiBTY2hlZHVsZXIgaGVscGVyIGZ1bmN0aW9ucy4NCj4gKyAqLw0KPiArdm9pZCBibGtf
bXFfc2NoZWRfZnJlZV9oY3R4X2RhdGEoc3RydWN0IHJlcXVlc3RfcXVldWUgKnEsDQo+ICsJCQkJ
IHZvaWQgKCpleGl0KShzdHJ1Y3QgYmxrX21xX2h3X2N0eCAqKSk7DQo+ICt2b2lkIGJsa19tcV9z
Y2hlZF9yZXF1ZXN0X2luc2VydGVkKHN0cnVjdCByZXF1ZXN0ICpycSk7DQo+ICtib29sIGJsa19t
cV9zY2hlZF90cnlfbWVyZ2Uoc3RydWN0IHJlcXVlc3RfcXVldWUgKnEsIHN0cnVjdCBiaW8gKmJp
bywNCj4gKwkJCSAgICBzdHJ1Y3QgcmVxdWVzdCAqKm1lcmdlZF9yZXF1ZXN0KTsNCj4gK2Jvb2wg
YmxrX21xX3NjaGVkX3RyeV9pbnNlcnRfbWVyZ2Uoc3RydWN0IHJlcXVlc3RfcXVldWUgKnEsIHN0
cnVjdCByZXF1ZXN0ICpycSk7DQo+ICsNCj4gKyNlbmRpZg0KDQpQbGVhc2UgbWFrZSBzdXJlIHRo
YXQgdGhlIG9yZGVyIG9mICNpbmNsdWRlIGRpcmVjdGl2ZXMgZG9lcyBub3QgYWZmZWN0IHRoZQ0K
Y29tcGlsYXRpb24gcmVzdWx0LCBlLmcuIGJ5IGFkZGluZyBmb3J3YXJkIGRlY2xhcmF0aW9ucyBm
b3IgdGhlIHN0cnVjdHVyZXMNCnVzZWQgYXMgYXJndW1lbnRzLg0KDQpUaGFua3MsDQoNCkJhcnQu

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 02/12] block: Fix declaration of blk-mq scheduler functions
@ 2017-09-15 14:35     ` Bart Van Assche
  0 siblings, 0 replies; 48+ messages in thread
From: Bart Van Assche @ 2017-09-15 14:35 UTC (permalink / raw)
  To: linux-scsi, linux-block, Damien Le Moal, martin.petersen, axboe; +Cc: hch

On Fri, 2017-09-15 at 19:06 +0900, Damien Le Moal wrote:
> The functions blk_mq_sched_free_hctx_data(), blk_mq_sched_try_merge(),
> blk_mq_sched_try_insert_merge() and blk_mq_sched_request_inserted() are
> all exported symbols but are declared only internally in
> block/blk-mq-sched.h. Move these declarations to the new file
> include/linux/blk-mq-sched.h to make them available to block scheduler
> modules implemented outside of the block directory.

Same comment here: should the title of this patch perhaps have been "Move
declarations of ..."?

> +#ifndef BLK_MQ_SCHED_H
> +#define BLK_MQ_SCHED_H
> +
> +/*
> + * Scheduler helper functions.
> + */
> +void blk_mq_sched_free_hctx_data(struct request_queue *q,
> +				 void (*exit)(struct blk_mq_hw_ctx *));
> +void blk_mq_sched_request_inserted(struct request *rq);
> +bool blk_mq_sched_try_merge(struct request_queue *q, struct bio *bio,
> +			    struct request **merged_request);
> +bool blk_mq_sched_try_insert_merge(struct request_queue *q, struct request *rq);
> +
> +#endif

Please make sure that the order of #include directives does not affect the
compilation result, e.g. by adding forward declarations for the structures
used as arguments.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 03/12] block: Add zoned block device information to request queue
  2017-09-15 10:06 ` [PATCH V3 03/12] block: Add zoned block device information to request queue Damien Le Moal
@ 2017-09-15 14:38     ` Bart Van Assche
  2017-09-15 17:48   ` Christoph Hellwig
  1 sibling, 0 replies; 48+ messages in thread
From: Bart Van Assche @ 2017-09-15 14:38 UTC (permalink / raw)
  To: linux-scsi, linux-block, Damien Le Moal, martin.petersen, axboe; +Cc: hch

T24gRnJpLCAyMDE3LTA5LTE1IGF0IDE5OjA2ICswOTAwLCBEYW1pZW4gTGUgTW9hbCB3cm90ZToN
Cj4gQEAgLTQ5Miw2ICs0OTcsMTAgQEAgc3RydWN0IHJlcXVlc3RfcXVldWUgew0KPiAgCXN0cnVj
dCBibGtfaW50ZWdyaXR5IGludGVncml0eTsNCj4gICNlbmRpZgkvKiBDT05GSUdfQkxLX0RFVl9J
TlRFR1JJVFkgKi8NCj4gIA0KPiArI2lmZGVmIENPTkZJR19CTEtfREVWX1pPTkVEDQo+ICsJc3Ry
dWN0IGJsa196b25lZAl6b25lZDsNCj4gKyNlbmRpZg0KPiArDQo+ICAjaWZkZWYgQ09ORklHX1BN
DQo+ICAJc3RydWN0IGRldmljZQkJKmRldjsNCj4gIAlpbnQJCQlycG1fc3RhdHVzOw0KPiBAQCAt
Nzg1LDYgKzc5NCwxMSBAQCBzdGF0aWMgaW5saW5lIHVuc2lnbmVkIGludCBibGtfcXVldWVfem9u
ZV9zZWN0b3JzKHN0cnVjdCByZXF1ZXN0X3F1ZXVlICpxKQ0KPiAgCXJldHVybiBibGtfcXVldWVf
aXNfem9uZWQocSkgPyBxLT5saW1pdHMuY2h1bmtfc2VjdG9ycyA6IDA7DQo+ICB9DQo+ICANCj4g
K3N0YXRpYyBpbmxpbmUgdW5zaWduZWQgaW50IGJsa19xdWV1ZV9ucl96b25lcyhzdHJ1Y3QgcmVx
dWVzdF9xdWV1ZSAqcSkNCj4gK3sNCj4gKwlyZXR1cm4gYmxrX3F1ZXVlX2lzX3pvbmVkKHEpID8g
cS0+em9uZWQubnJfem9uZXMgOiAwOw0KPiArfQ0KDQpEb2VzIHRoaXMgY29kZSBjb21waWxlIGNv
cnJlY3RseSBpZiBDT05GSUdfQkxLX0RFVl9aT05FRCBpcyBkaXNhYmxlZD8gU2hvdWxkDQp0aGUg
ZGVmaW5pdGlvbiBvZiBibGtfcXVldWVfbnJfem9uZXMoKSBwZXJoYXBzIGJlIHN1cnJvdW5kZWQg
YnkgI2lmZGVmDQpDT05GSUdfQkxLX0RFVl9aT05FRCAvICNlbmRpZj8NCg0KVGhhbmtzLA0KDQpC
YXJ0Lg==

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 03/12] block: Add zoned block device information to request queue
@ 2017-09-15 14:38     ` Bart Van Assche
  0 siblings, 0 replies; 48+ messages in thread
From: Bart Van Assche @ 2017-09-15 14:38 UTC (permalink / raw)
  To: linux-scsi, linux-block, Damien Le Moal, martin.petersen, axboe; +Cc: hch

On Fri, 2017-09-15 at 19:06 +0900, Damien Le Moal wrote:
> @@ -492,6 +497,10 @@ struct request_queue {
>  	struct blk_integrity integrity;
>  #endif	/* CONFIG_BLK_DEV_INTEGRITY */
>  
> +#ifdef CONFIG_BLK_DEV_ZONED
> +	struct blk_zoned	zoned;
> +#endif
> +
>  #ifdef CONFIG_PM
>  	struct device		*dev;
>  	int			rpm_status;
> @@ -785,6 +794,11 @@ static inline unsigned int blk_queue_zone_sectors(struct request_queue *q)
>  	return blk_queue_is_zoned(q) ? q->limits.chunk_sectors : 0;
>  }
>  
> +static inline unsigned int blk_queue_nr_zones(struct request_queue *q)
> +{
> +	return blk_queue_is_zoned(q) ? q->zoned.nr_zones : 0;
> +}

Does this code compile correctly if CONFIG_BLK_DEV_ZONED is disabled? Should
the definition of blk_queue_nr_zones() perhaps be surrounded by #ifdef
CONFIG_BLK_DEV_ZONED / #endif?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 06/12] scsi: sd_zbc: Rearrange code
  2017-09-15 10:06 ` [PATCH V3 06/12] scsi: sd_zbc: Rearrange code Damien Le Moal
@ 2017-09-15 14:51     ` Bart Van Assche
  2017-09-15 14:51     ` Bart Van Assche
  2017-09-15 17:49   ` Christoph Hellwig
  2 siblings, 0 replies; 48+ messages in thread
From: Bart Van Assche @ 2017-09-15 14:51 UTC (permalink / raw)
  To: linux-scsi, linux-block, Damien Le Moal, martin.petersen, axboe; +Cc: hch

T24gRnJpLCAyMDE3LTA5LTE1IGF0IDE5OjA2ICswOTAwLCBEYW1pZW4gTGUgTW9hbCB3cm90ZToN
Cj4gUmVhcnJhbmdlIHNkX3piY19zZXR1cCgpIHRvIGluY2x1ZGUgdXNlXzE2X2Zvcl9ydyBhbmQg
dXNlXzEwX2Zvcl9ydw0KPiBhc3NpZ25tZW50cyBhbmQgbW92ZSB0aGUgY2FsY3VsYXRpb24gb2Yg
c2RrcC0+em9uZV9zaGlmdCB0b2dldGhlcg0KPiB3aXRoIHRoZSBhc3NpZ25tZW50IG9mIHRoZSB2
ZXJpZmllZCB6b25lX2Jsb2NrcyB2YWx1ZSBpbg0KPiBzZF96YmNfY2hlY2tfem9uZV9zaXplKCku
DQoNCkJvdGggZnVuY3Rpb25zIGFyZSBjYWxsZWQgZnJvbSBpbnNpZGUgYmxvY2tfZGV2aWNlX29w
ZXJhdGlvbnMucmV2YWxpZGF0ZV9kaXNrLg0KSXNuJ3QgdGhhdCB0b28gbGF0ZSB0byBzZXQgdXNl
XzFbMDZdX2Zvcl9ydz8gU2hvdWxkbid0IGJvdGggYmUgc2V0IGp1c3QgYWZ0ZXINCnRoZSBzbGF2
ZV9hbGxvYyBjYWxsYmFjayBoYXMgYmVlbiBjYWxsZWQ/DQoNClRoYW5rcywNCg0KQmFydC4=

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 06/12] scsi: sd_zbc: Rearrange code
@ 2017-09-15 14:51     ` Bart Van Assche
  0 siblings, 0 replies; 48+ messages in thread
From: Bart Van Assche @ 2017-09-15 14:51 UTC (permalink / raw)
  To: linux-scsi, linux-block, Damien Le Moal, martin.petersen, axboe; +Cc: hch

On Fri, 2017-09-15 at 19:06 +0900, Damien Le Moal wrote:
> Rearrange sd_zbc_setup() to include use_16_for_rw and use_10_for_rw
> assignments and move the calculation of sdkp->zone_shift together
> with the assignment of the verified zone_blocks value in
> sd_zbc_check_zone_size().

Both functions are called from inside block_device_operations.revalidate_disk.
Isn't that too late to set use_1[06]_for_rw? Shouldn't both be set just after
the slave_alloc callback has been called?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 10/12] scsi: sd_zbc: Limit zone write locking to sequential zones
  2017-09-15 10:06 ` [PATCH V3 10/12] scsi: sd_zbc: Limit zone write locking to sequential zones Damien Le Moal
@ 2017-09-15 16:23     ` Bart Van Assche
  2017-09-15 16:23     ` Bart Van Assche
  1 sibling, 0 replies; 48+ messages in thread
From: Bart Van Assche @ 2017-09-15 16:23 UTC (permalink / raw)
  To: linux-scsi, linux-block, Damien Le Moal, martin.petersen, axboe; +Cc: hch

T24gRnJpLCAyMDE3LTA5LTE1IGF0IDE5OjA2ICswOTAwLCBEYW1pZW4gTGUgTW9hbCB3cm90ZToN
Cj4gKwkgKiBUaGVyZSBpcyBubyB3cml0ZSBjb25zdHJhaW50cyBvbiBjb252ZW50aW9uYWwgem9u
ZXMuIFNvIGFueSB3cml0ZQ0KICAgICAgICAgICBeXl5eXl5eXl5eXg0KU2hvdWxkIHRoaXMgaGF2
ZSBiZWVuICJUaGVyZSBhcmUgbm8iPw0KDQo+IC0JaWYgKHNka3AtPnpvbmVzX3dsb2NrICYmDQo+
IC0JICAgIHRlc3RfYW5kX3NldF9iaXQoem5vLCBzZGtwLT56b25lc193bG9jaykpDQo+ICsJaWYg
KHEtPnpvbmVkLnNlcV96b25lcyAmJiB0ZXN0X2JpdCh6bm8sIHEtPnpvbmVkLnNlcV96b25lcykp
DQo+ICsJCXJldHVybiBCTEtQUkVQX09LOw0KPiArCWlmIChzZGtwLT56b25lc193bG9jayAmJiB0
ZXN0X2FuZF9zZXRfYml0KHpubywgc2RrcC0+em9uZXNfd2xvY2spKQ0KPiAgCQlyZXR1cm4gQkxL
UFJFUF9ERUZFUjsNCg0KUGxlYXNlIGludHJvZHVjZSBoZWxwZXIgZnVuY3Rpb25zIGluIGluY2x1
ZGUvbGludXgvYmxrZGV2Lmggb3IgaW4gYSBuZXcNCmhlYWRlciBmaWxlIGZvciB2ZXJpZnlpbmcg
d2hldGhlciBvciBub3QgYSB6b25lIGlzIGEgc2VxdWVudGlhbCB6b25lIGFuZCBhbHNvDQpmb3Ig
bG9ja2luZyBhIHpvbmUgZm9yIHdyaXRlcy4gSSBleHBlY3QgdGhhdCB0aGF0IHdpbGwgbWFrZSB0
aGUgYWJvdmUgY29kZQ0Kc2lnbmlmaWNhbnRseSBlYXNpZXIgdG8gcmVhZC4NCg0KVGhhbmtzLA0K
DQpCYXJ0Lg==

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 10/12] scsi: sd_zbc: Limit zone write locking to sequential zones
@ 2017-09-15 16:23     ` Bart Van Assche
  0 siblings, 0 replies; 48+ messages in thread
From: Bart Van Assche @ 2017-09-15 16:23 UTC (permalink / raw)
  To: linux-scsi, linux-block, Damien Le Moal, martin.petersen, axboe; +Cc: hch

On Fri, 2017-09-15 at 19:06 +0900, Damien Le Moal wrote:
> +	 * There is no write constraints on conventional zones. So any write
           ^^^^^^^^^^^
Should this have been "There are no"?

> -	if (sdkp->zones_wlock &&
> -	    test_and_set_bit(zno, sdkp->zones_wlock))
> +	if (q->zoned.seq_zones && test_bit(zno, q->zoned.seq_zones))
> +		return BLKPREP_OK;
> +	if (sdkp->zones_wlock && test_and_set_bit(zno, sdkp->zones_wlock))
>  		return BLKPREP_DEFER;

Please introduce helper functions in include/linux/blkdev.h or in a new
header file for verifying whether or not a zone is a sequential zone and also
for locking a zone for writes. I expect that that will make the above code
significantly easier to read.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 07/12] scsi: sd_zbc: Use well defined macros
  2017-09-15 10:06 ` [PATCH V3 07/12] scsi: sd_zbc: Use well defined macros Damien Le Moal
@ 2017-09-15 16:27     ` Bart Van Assche
  2017-09-15 16:27     ` Bart Van Assche
  2017-09-15 17:51   ` Christoph Hellwig
  2 siblings, 0 replies; 48+ messages in thread
From: Bart Van Assche @ 2017-09-15 16:27 UTC (permalink / raw)
  To: linux-scsi, linux-block, Damien Le Moal, martin.petersen, axboe; +Cc: hch

T24gRnJpLCAyMDE3LTA5LTE1IGF0IDE5OjA2ICswOTAwLCBEYW1pZW4gTGUgTW9hbCB3cm90ZToN
Cj4gaW5zdGVhZCBvZiBvcGVuIGNvZGluZywgdXNlIHRoZSBtaW4oKSBtYWNybyB0byBjYWxjdWxh
dGUgYSByZXBvcnQgem9uZXMNCj4gcmVwbHkgYnVmZmVyIGxlbmd0aCBpbiBzZF96YmNfY2hlY2tf
em9uZV9zaXplKCkgYW5kIHRoZSByb3VuZF91cCgpDQo+IG1hY3JvIGZvciBjYWxjdWxhdGluZyB0
aGUgbnVtYmVyIG9mIHpvbmVzIGluIHNkX3piY19zZXR1cCgpLg0KDQpSZXZpZXdlZC1ieTogQmFy
dCBWYW4gQXNzY2hlIDxiYXJ0LnZhbmFzc2NoZUB3ZGMuY29tPg0KDQo=

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 07/12] scsi: sd_zbc: Use well defined macros
@ 2017-09-15 16:27     ` Bart Van Assche
  0 siblings, 0 replies; 48+ messages in thread
From: Bart Van Assche @ 2017-09-15 16:27 UTC (permalink / raw)
  To: linux-scsi, linux-block, Damien Le Moal, martin.petersen, axboe; +Cc: hch

On Fri, 2017-09-15 at 19:06 +0900, Damien Le Moal wrote:
> instead of open coding, use the min() macro to calculate a report zones
> reply buffer length in sd_zbc_check_zone_size() and the round_up()
> macro for calculating the number of zones in sd_zbc_setup().

Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 01/12] block: Fix declaration of blk-mq debugfs functions
  2017-09-15 10:06 ` [PATCH V3 01/12] block: Fix declaration of blk-mq debugfs functions Damien Le Moal
  2017-09-15 14:33     ` Bart Van Assche
@ 2017-09-15 17:45   ` Christoph Hellwig
  2017-09-15 22:36     ` Damien Le Moal
  1 sibling, 1 reply; 48+ messages in thread
From: Christoph Hellwig @ 2017-09-15 17:45 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe,
	Christoph Hellwig, Bart Van Assche

On Fri, Sep 15, 2017 at 07:06:34PM +0900, Damien Le Moal wrote:
> __blk_mq_debugfs_rq_show() and blk_mq_debugfs_rq_show() are exported
> symbols but ar eonly declared in the block internal file
> block/blk-mq-debugfs.h. which is not cleanly accessible to files outside
> of the block directory.
> Move the declaration of these functions to the new file
> include/linux/blk-mq-debugfs.h file to make the declarations cleanly
> available to other modules.
> 
> While at it, also move the definition of the blk_mq_debugfs_attr
> structure to allow scheduler modules outside of the block directory to
> define debugfs attributes.

I think we should keep them local to block/ where all I/O schedulers
live, so: NAK.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 02/12] block: Fix declaration of blk-mq scheduler functions
  2017-09-15 10:06 ` [PATCH V3 02/12] block: Fix declaration of blk-mq scheduler functions Damien Le Moal
  2017-09-15 14:35     ` Bart Van Assche
@ 2017-09-15 17:46   ` Christoph Hellwig
  1 sibling, 0 replies; 48+ messages in thread
From: Christoph Hellwig @ 2017-09-15 17:46 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe,
	Christoph Hellwig, Bart Van Assche

Same as for patch 1: this should stay local to block/ - we don't
want random drivers to grow I/O schedulers.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 03/12] block: Add zoned block device information to request queue
  2017-09-15 10:06 ` [PATCH V3 03/12] block: Add zoned block device information to request queue Damien Le Moal
  2017-09-15 14:38     ` Bart Van Assche
@ 2017-09-15 17:48   ` Christoph Hellwig
  1 sibling, 0 replies; 48+ messages in thread
From: Christoph Hellwig @ 2017-09-15 17:48 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe,
	Christoph Hellwig, Bart Van Assche

> +struct blk_zoned {
> +	unsigned int	nr_zones;
> +	unsigned long	*seq_zones;
> +};
> +
>  struct blk_zone_report_hdr {
>  	unsigned int	nr_zones;
>  	u8		padding[60];
> @@ -492,6 +497,10 @@ struct request_queue {
>  	struct blk_integrity integrity;
>  #endif	/* CONFIG_BLK_DEV_INTEGRITY */
>  
> +#ifdef CONFIG_BLK_DEV_ZONED
> +	struct blk_zoned	zoned;
> +#endif

I'd prefer to just add the two fields direct to struct request_queue
instead of the container structure.

> +static inline unsigned int blk_queue_nr_zones(struct request_queue *q)
> +{
> +	return blk_queue_is_zoned(q) ? q->zoned.nr_zones : 0;
> +}

I don't think this is going to compile without CONFIG_BLK_DEV_ZONED,
we'd either need two versions of it, or just always add nr_zones to
the request queue.

With nr_zones always present we could then remove the first check,
too as it would always be initialized to zero.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 04/12] scsi: sd_zbc: Move ZBC declarations to scsi_proto.h
  2017-09-15 10:06 ` [PATCH V3 04/12] scsi: sd_zbc: Move ZBC declarations to scsi_proto.h Damien Le Moal
@ 2017-09-15 17:48   ` Christoph Hellwig
  0 siblings, 0 replies; 48+ messages in thread
From: Christoph Hellwig @ 2017-09-15 17:48 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe,
	Christoph Hellwig, Bart Van Assche

Looks fine,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 06/12] scsi: sd_zbc: Rearrange code
  2017-09-15 10:06 ` [PATCH V3 06/12] scsi: sd_zbc: Rearrange code Damien Le Moal
  2017-09-15 10:45   ` Hannes Reinecke
  2017-09-15 14:51     ` Bart Van Assche
@ 2017-09-15 17:49   ` Christoph Hellwig
  2 siblings, 0 replies; 48+ messages in thread
From: Christoph Hellwig @ 2017-09-15 17:49 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe,
	Christoph Hellwig, Bart Van Assche

Looks fine,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 06/12] scsi: sd_zbc: Rearrange code
  2017-09-15 14:51     ` Bart Van Assche
  (?)
@ 2017-09-15 17:51     ` hch
  2017-09-15 21:02         ` Bart Van Assche
  -1 siblings, 1 reply; 48+ messages in thread
From: hch @ 2017-09-15 17:51 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi, linux-block, Damien Le Moal, martin.petersen, axboe, hch

On Fri, Sep 15, 2017 at 02:51:03PM +0000, Bart Van Assche wrote:
> On Fri, 2017-09-15 at 19:06 +0900, Damien Le Moal wrote:
> > Rearrange sd_zbc_setup() to include use_16_for_rw and use_10_for_rw
> > assignments and move the calculation of sdkp->zone_shift together
> > with the assignment of the verified zone_blocks value in
> > sd_zbc_check_zone_size().
> 
> Both functions are called from inside block_device_operations.revalidate_disk.
> Isn't that too late to set use_1[06]_for_rw? Shouldn't both be set just after
> the slave_alloc callback has been called?

Hmm.  sd_read_capacity already is called from the same path and seems
to work..

> 
> Thanks,
> 
> Bart.---end quoted text---

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 07/12] scsi: sd_zbc: Use well defined macros
  2017-09-15 10:06 ` [PATCH V3 07/12] scsi: sd_zbc: Use well defined macros Damien Le Moal
  2017-09-15 10:45   ` Hannes Reinecke
  2017-09-15 16:27     ` Bart Van Assche
@ 2017-09-15 17:51   ` Christoph Hellwig
  2 siblings, 0 replies; 48+ messages in thread
From: Christoph Hellwig @ 2017-09-15 17:51 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe,
	Christoph Hellwig, Bart Van Assche

Looks fine,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 08/12] scsi: sd_zbc: Fix sd_zbc_read_zoned_characteristics()
  2017-09-15 10:06 ` [PATCH V3 08/12] scsi: sd_zbc: Fix sd_zbc_read_zoned_characteristics() Damien Le Moal
  2017-09-15 10:46   ` Hannes Reinecke
@ 2017-09-15 17:52   ` Christoph Hellwig
  1 sibling, 0 replies; 48+ messages in thread
From: Christoph Hellwig @ 2017-09-15 17:52 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe,
	Christoph Hellwig, Bart Van Assche, stable

Looks fine,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 06/12] scsi: sd_zbc: Rearrange code
  2017-09-15 17:51     ` hch
@ 2017-09-15 21:02         ` Bart Van Assche
  0 siblings, 0 replies; 48+ messages in thread
From: Bart Van Assche @ 2017-09-15 21:02 UTC (permalink / raw)
  To: hch; +Cc: linux-scsi, linux-block, Damien Le Moal, martin.petersen, axboe

T24gRnJpLCAyMDE3LTA5LTE1IGF0IDE5OjUxICswMjAwLCBoY2hAbHN0LmRlIHdyb3RlOg0KPiBP
biBGcmksIFNlcCAxNSwgMjAxNyBhdCAwMjo1MTowM1BNICswMDAwLCBCYXJ0IFZhbiBBc3NjaGUg
d3JvdGU6DQo+ID4gT24gRnJpLCAyMDE3LTA5LTE1IGF0IDE5OjA2ICswOTAwLCBEYW1pZW4gTGUg
TW9hbCB3cm90ZToNCj4gPiA+IFJlYXJyYW5nZSBzZF96YmNfc2V0dXAoKSB0byBpbmNsdWRlIHVz
ZV8xNl9mb3JfcncgYW5kIHVzZV8xMF9mb3JfcncNCj4gPiA+IGFzc2lnbm1lbnRzIGFuZCBtb3Zl
IHRoZSBjYWxjdWxhdGlvbiBvZiBzZGtwLT56b25lX3NoaWZ0IHRvZ2V0aGVyDQo+ID4gPiB3aXRo
IHRoZSBhc3NpZ25tZW50IG9mIHRoZSB2ZXJpZmllZCB6b25lX2Jsb2NrcyB2YWx1ZSBpbg0KPiA+
ID4gc2RfemJjX2NoZWNrX3pvbmVfc2l6ZSgpLg0KPiA+IA0KPiA+IEJvdGggZnVuY3Rpb25zIGFy
ZSBjYWxsZWQgZnJvbSBpbnNpZGUgYmxvY2tfZGV2aWNlX29wZXJhdGlvbnMucmV2YWxpZGF0ZV9k
aXNrLg0KPiA+IElzbid0IHRoYXQgdG9vIGxhdGUgdG8gc2V0IHVzZV8xWzA2XV9mb3Jfcnc/IFNo
b3VsZG4ndCBib3RoIGJlIHNldCBqdXN0IGFmdGVyDQo+ID4gdGhlIHNsYXZlX2FsbG9jIGNhbGxi
YWNrIGhhcyBiZWVuIGNhbGxlZD8NCj4gDQo+IEhtbS4gIHNkX3JlYWRfY2FwYWNpdHkgYWxyZWFk
eSBpcyBjYWxsZWQgZnJvbSB0aGUgc2FtZSBwYXRoIGFuZCBzZWVtcw0KPiB0byB3b3JrLi4NCg0K
SGVsbG8gQ2hyaXN0b3BoLA0KDQpNeSBjb25jZXJuIGlzIHRoYXQgaWYgYm90aCBhIFNDU0kgTExE
IGFuZCB0aGUgWkJDIGNvZGUgc2V0IHVzZV8xWzA2XV9mb3JfcncNCnRoYXQgdGhlc2Ugc2V0dGlu
Z3MgbWF5IGNvbmZsaWN0LiBTaG91bGQgdGhlIFpCQyBjb2RlIGUuZy4gcmVmdXNlIHRoYXQgdGhl
IHNkDQpkcml2ZXIgaXMgYXR0YWNoZWQgdG8gYSBaQkMgZGlzayBjb250cm9sbGVkIGJ5IGEgU0NT
SSBMTEQgdGhhdCBzdXBwb3J0cw0KdXNlXzEwX2Zvcl9ydyBidXQgbm90IHVzZV8xNl9mb3Jfcnc/
DQoNClRoYW5rcywNCg0KQmFydC4=

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 06/12] scsi: sd_zbc: Rearrange code
@ 2017-09-15 21:02         ` Bart Van Assche
  0 siblings, 0 replies; 48+ messages in thread
From: Bart Van Assche @ 2017-09-15 21:02 UTC (permalink / raw)
  To: hch; +Cc: linux-scsi, linux-block, Damien Le Moal, martin.petersen, axboe

On Fri, 2017-09-15 at 19:51 +0200, hch@lst.de wrote:
> On Fri, Sep 15, 2017 at 02:51:03PM +0000, Bart Van Assche wrote:
> > On Fri, 2017-09-15 at 19:06 +0900, Damien Le Moal wrote:
> > > Rearrange sd_zbc_setup() to include use_16_for_rw and use_10_for_rw
> > > assignments and move the calculation of sdkp->zone_shift together
> > > with the assignment of the verified zone_blocks value in
> > > sd_zbc_check_zone_size().
> > 
> > Both functions are called from inside block_device_operations.revalidate_disk.
> > Isn't that too late to set use_1[06]_for_rw? Shouldn't both be set just after
> > the slave_alloc callback has been called?
> 
> Hmm.  sd_read_capacity already is called from the same path and seems
> to work..

Hello Christoph,

My concern is that if both a SCSI LLD and the ZBC code set use_1[06]_for_rw
that these settings may conflict. Should the ZBC code e.g. refuse that the sd
driver is attached to a ZBC disk controlled by a SCSI LLD that supports
use_10_for_rw but not use_16_for_rw?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 06/12] scsi: sd_zbc: Rearrange code
  2017-09-15 21:02         ` Bart Van Assche
  (?)
@ 2017-09-15 22:35         ` Damien Le Moal
  2017-09-16  2:40             ` Bart Van Assche
  -1 siblings, 1 reply; 48+ messages in thread
From: Damien Le Moal @ 2017-09-15 22:35 UTC (permalink / raw)
  To: Bart Van Assche, hch; +Cc: linux-scsi, linux-block, martin.petersen, axboe

On 9/16/17 06:02, Bart Van Assche wrote:
> On Fri, 2017-09-15 at 19:51 +0200, hch@lst.de wrote:
>> On Fri, Sep 15, 2017 at 02:51:03PM +0000, Bart Van Assche wrote:
>>> On Fri, 2017-09-15 at 19:06 +0900, Damien Le Moal wrote:
>>>> Rearrange sd_zbc_setup() to include use_16_for_rw and use_10_for_rw
>>>> assignments and move the calculation of sdkp->zone_shift together
>>>> with the assignment of the verified zone_blocks value in
>>>> sd_zbc_check_zone_size().
>>>
>>> Both functions are called from inside block_device_operations.revalidate_disk.
>>> Isn't that too late to set use_1[06]_for_rw? Shouldn't both be set just after
>>> the slave_alloc callback has been called?
>>
>> Hmm.  sd_read_capacity already is called from the same path and seems
>> to work..
> 
> Hello Christoph,
> 
> My concern is that if both a SCSI LLD and the ZBC code set use_1[06]_for_rw
> that these settings may conflict. Should the ZBC code e.g. refuse that the sd
> driver is attached to a ZBC disk controlled by a SCSI LLD that supports
> use_10_for_rw but not use_16_for_rw?

rw16 is mandatory for ZBC drives. So it has to be set to true. If the
HBA does not support rw16 (why would that happen ?), then the disk
should not be used.

Cheers.

-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 01/12] block: Fix declaration of blk-mq debugfs functions
  2017-09-15 17:45   ` Christoph Hellwig
@ 2017-09-15 22:36     ` Damien Le Moal
  0 siblings, 0 replies; 48+ messages in thread
From: Damien Le Moal @ 2017-09-15 22:36 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe,
	Bart Van Assche

On 9/16/17 02:45, Christoph Hellwig wrote:
> On Fri, Sep 15, 2017 at 07:06:34PM +0900, Damien Le Moal wrote:
>> __blk_mq_debugfs_rq_show() and blk_mq_debugfs_rq_show() are exported
>> symbols but ar eonly declared in the block internal file
>> block/blk-mq-debugfs.h. which is not cleanly accessible to files outside
>> of the block directory.
>> Move the declaration of these functions to the new file
>> include/linux/blk-mq-debugfs.h file to make the declarations cleanly
>> available to other modules.
>>
>> While at it, also move the definition of the blk_mq_debugfs_attr
>> structure to allow scheduler modules outside of the block directory to
>> define debugfs attributes.
> 
> I think we should keep them local to block/ where all I/O schedulers
> live, so: NAK.

OK. I will drop this one and 02/12.

Thanks.

-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 05/12] scsi: sd_zbc: Fix comments and indentation
  2017-09-15 10:44   ` Hannes Reinecke
@ 2017-09-15 22:48     ` Damien Le Moal
  0 siblings, 0 replies; 48+ messages in thread
From: Damien Le Moal @ 2017-09-15 22:48 UTC (permalink / raw)
  To: Hannes Reinecke, linux-scsi, Martin K . Petersen, linux-block,
	Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

On 9/15/17 19:44, Hannes Reinecke wrote:
> On 09/15/2017 12:06 PM, Damien Le Moal wrote:
>> Fix comments style (do not use documented comment style) and add some
>> comments to clarify some functions. Also fix some functions signature
>> indentation and remove a useless blank line in sd_zbc_read_zones().
>>
>> No functional change is introduced by this patch.
>>
>> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
>> ---
>>  drivers/scsi/scsi_lib.c |  5 +++-
>>  drivers/scsi/sd_zbc.c   | 67 ++++++++++++++++++++++++++++++++++++-------------
>>  2 files changed, 53 insertions(+), 19 deletions(-)
>>
> Actually you should've used kernel-doc style to format the comments;
> just to keep the 0-day bot happy....

I thought that comments to functions with the kernel-doc format was only
if you want the function to be documented. That generally includes at
least all the exported functions.
In this case, the functions are mostly local to sd_zbc.c... I thought
the documentation of those was not very useful.

I can revert though. No problem at all.

Thanks !

-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 06/12] scsi: sd_zbc: Rearrange code
  2017-09-15 22:35         ` Damien Le Moal
@ 2017-09-16  2:40             ` Bart Van Assche
  0 siblings, 0 replies; 48+ messages in thread
From: Bart Van Assche @ 2017-09-16  2:40 UTC (permalink / raw)
  To: hch, Damien Le Moal; +Cc: linux-scsi, linux-block, martin.petersen, axboe

T24gU2F0LCAyMDE3LTA5LTE2IGF0IDA3OjM1ICswOTAwLCBEYW1pZW4gTGUgTW9hbCB3cm90ZToN
Cj4gcncxNiBpcyBtYW5kYXRvcnkgZm9yIFpCQyBkcml2ZXMuIFNvIGl0IGhhcyB0byBiZSBzZXQg
dG8gdHJ1ZS4gSWYgdGhlDQo+IEhCQSBkb2VzIG5vdCBzdXBwb3J0IHJ3MTYgKHdoeSB3b3VsZCB0
aGF0IGhhcHBlbiA/KSwgdGhlbiB0aGUgZGlzaw0KPiBzaG91bGQgbm90IGJlIHVzZWQuDQoNCkl0
J3MgZ29vZCB0aGF0IGFsbCBIQkFzIHN1cHBvcnQgcncxNi4gQnV0IGl0J3Mgbm9udHJpdmlhbCB0
byBhbmFseXplIHdoZXRoZXINCm9yIG5vdCB1c2VfMVswNl1fZm9yX3J3IGFyZSBzZXQgYmVmb3Jl
IHRoZXNlIGFyZSB1c2VkIHNvIG1heWJlIHdlIHNob3VsZCB0aGluaw0KYWJvdXQgaG93IHdlIGNh
biByZXN0cnVjdHVyZSB0aGUgU0NTSSBjb2RlIHN1Y2ggdGhhdCBpdCBiZWNvbWVzIGVhc2llciB0
byB2ZXJpZnkNCnRoYXQgdXNlXzFbMDZdX2Zvcl9ydyBhcmUgc2V0IGJlZm9yZSB0aGVzZSBhcmUg
dXNlZC4NCg0KQmFydC4=

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V3 06/12] scsi: sd_zbc: Rearrange code
@ 2017-09-16  2:40             ` Bart Van Assche
  0 siblings, 0 replies; 48+ messages in thread
From: Bart Van Assche @ 2017-09-16  2:40 UTC (permalink / raw)
  To: hch, Damien Le Moal; +Cc: linux-scsi, linux-block, martin.petersen, axboe

On Sat, 2017-09-16 at 07:35 +0900, Damien Le Moal wrote:
> rw16 is mandatory for ZBC drives. So it has to be set to true. If the
> HBA does not support rw16 (why would that happen ?), then the disk
> should not be used.

It's good that all HBAs support rw16. But it's nontrivial to analyze whether
or not use_1[06]_for_rw are set before these are used so maybe we should think
about how we can restructure the SCSI code such that it becomes easier to verify
that use_1[06]_for_rw are set before these are used.

Bart.

^ permalink raw reply	[flat|nested] 48+ messages in thread

end of thread, other threads:[~2017-09-16  2:40 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-15 10:06 [PATCH V3 00/12] scsi-mq support for ZBC disks Damien Le Moal
2017-09-15 10:06 ` [PATCH V3 01/12] block: Fix declaration of blk-mq debugfs functions Damien Le Moal
2017-09-15 14:33   ` Bart Van Assche
2017-09-15 14:33     ` Bart Van Assche
2017-09-15 17:45   ` Christoph Hellwig
2017-09-15 22:36     ` Damien Le Moal
2017-09-15 10:06 ` [PATCH V3 02/12] block: Fix declaration of blk-mq scheduler functions Damien Le Moal
2017-09-15 14:35   ` Bart Van Assche
2017-09-15 14:35     ` Bart Van Assche
2017-09-15 17:46   ` Christoph Hellwig
2017-09-15 10:06 ` [PATCH V3 03/12] block: Add zoned block device information to request queue Damien Le Moal
2017-09-15 14:38   ` Bart Van Assche
2017-09-15 14:38     ` Bart Van Assche
2017-09-15 17:48   ` Christoph Hellwig
2017-09-15 10:06 ` [PATCH V3 04/12] scsi: sd_zbc: Move ZBC declarations to scsi_proto.h Damien Le Moal
2017-09-15 17:48   ` Christoph Hellwig
2017-09-15 10:06 ` [PATCH V3 05/12] scsi: sd_zbc: Fix comments and indentation Damien Le Moal
2017-09-15 10:44   ` Hannes Reinecke
2017-09-15 22:48     ` Damien Le Moal
2017-09-15 10:06 ` [PATCH V3 06/12] scsi: sd_zbc: Rearrange code Damien Le Moal
2017-09-15 10:45   ` Hannes Reinecke
2017-09-15 14:51   ` Bart Van Assche
2017-09-15 14:51     ` Bart Van Assche
2017-09-15 17:51     ` hch
2017-09-15 21:02       ` Bart Van Assche
2017-09-15 21:02         ` Bart Van Assche
2017-09-15 22:35         ` Damien Le Moal
2017-09-16  2:40           ` Bart Van Assche
2017-09-16  2:40             ` Bart Van Assche
2017-09-15 17:49   ` Christoph Hellwig
2017-09-15 10:06 ` [PATCH V3 07/12] scsi: sd_zbc: Use well defined macros Damien Le Moal
2017-09-15 10:45   ` Hannes Reinecke
2017-09-15 16:27   ` Bart Van Assche
2017-09-15 16:27     ` Bart Van Assche
2017-09-15 17:51   ` Christoph Hellwig
2017-09-15 10:06 ` [PATCH V3 08/12] scsi: sd_zbc: Fix sd_zbc_read_zoned_characteristics() Damien Le Moal
2017-09-15 10:46   ` Hannes Reinecke
2017-09-15 17:52   ` Christoph Hellwig
2017-09-15 10:06 ` [PATCH V3 09/12] scsi: sd_zbc: Initialize device queue zoned structure Damien Le Moal
2017-09-15 10:49   ` Hannes Reinecke
2017-09-15 10:06 ` [PATCH V3 10/12] scsi: sd_zbc: Limit zone write locking to sequential zones Damien Le Moal
2017-09-15 10:52   ` Hannes Reinecke
2017-09-15 16:23   ` Bart Van Assche
2017-09-15 16:23     ` Bart Van Assche
2017-09-15 10:06 ` [PATCH V3 11/12] scsi: sd_zbc: Disable zone write locking with scsi-mq Damien Le Moal
2017-09-15 10:52   ` Hannes Reinecke
2017-09-15 10:06 ` [PATCH V3 12/12] block: Introduce zoned I/O scheduler Damien Le Moal
2017-09-15 10:55   ` Hannes Reinecke

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.