* [PATCH V5 00/14] scsi-mq support for ZBC disks
@ 2017-09-25  6:14 Damien Le Moal
  2017-09-25  6:14 ` [PATCH V5 01/14] scsi: sd_zbc: Move ZBC declarations to scsi_proto.h Damien Le Moal
                   ` (13 more replies)
  0 siblings, 14 replies; 40+ messages in thread
From: Damien Le Moal @ 2017-09-25  6:14 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

This series implements support for ZBC disks used through the scsi-mq I/O path.

The current scsi level support of ZBC disks guarantees write request ordering
using a per-zone write lock which prevents simultaneously issuing multiple
write commands to a zone; doing so avoids reordering of sequential writes to
sequential zones. This method is however ineffective when scsi-mq is used with
zoned block devices, due to the different execution model of blk-mq, which
passes a request to the scsi layer for dispatching only after the request has
been removed from the I/O scheduler queue. That is, when the scsi layer tries
to lock the target zone of the request, the request may already be out of
order, and zone write locking fails to prevent that.

Various approaches have been tried to solve this problem. All of them had the
serious disadvantage of cluttering blk-mq code with zoned block device specific
conditions and processing. As such extensive changes could only turn into a
maintenance nightmare, a radically different solution is proposed here.

This series proposes implementing scsi-mq support for zoned block devices at
the I/O scheduler level with simple modifications of the mq-deadline scheduler.
The modifications add a per-zone write locking mechanism similar to the one
implemented in sd_zbc.c for the legacy scsi path. The zone write locking
mechanism is used for the exact same purpose, that is, to limit writes per
zone to at most one request to avoid reordering. The locking context however
changes from that of the legacy (single queue) scsi path and is moved to the
dispatch_request method of the scheduler. Within this context, under a spin
lock guaranteeing atomicity against other dispatch contexts, the target zones
of write requests can be locked before the write requests are removed from the
scheduler. In effect, this results in the same behavior as the legacy scsi
path: sequential write ordering is preserved.
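
The dispatch-time locking described above can be sketched as follows. This is
a simplified, single-threaded user-space model; the names and bitmap helpers
are illustrative and not the actual mq-deadline code, which relies on the
scheduler spin lock together with atomic test_and_set_bit():

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define NR_ZONES      8
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* One bit per zone: set while a write to that zone is in flight. */
static unsigned long zones_wlock[(NR_ZONES + BITS_PER_LONG - 1) / BITS_PER_LONG];

/* Non-atomic stand-in for the kernel's test_and_set_bit(). */
static bool test_and_set_zone(unsigned int zno)
{
	unsigned long mask = 1UL << (zno % BITS_PER_LONG);
	unsigned long *word = &zones_wlock[zno / BITS_PER_LONG];
	bool was_set = (*word & mask) != 0;

	*word |= mask;
	return was_set;
}

/* Called on write request completion to allow the next write to the zone. */
static void unlock_zone(unsigned int zno)
{
	zones_wlock[zno / BITS_PER_LONG] &= ~(1UL << (zno % BITS_PER_LONG));
}

/*
 * Dispatch decision: a write request may leave the scheduler only if its
 * target zone is not already locked. In the kernel this check runs in
 * dispatch_request() under a spin lock, giving it atomicity against other
 * dispatch contexts.
 */
static bool can_dispatch_write(unsigned int zno)
{
	return !test_and_set_zone(zno);
}
```

With this scheme a second write to a locked zone stays queued and is retried
at a later dispatch, after the first write completes and unlocks the zone.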

The changes to mq-deadline do not affect regular disks: the same scheduling
behavior is maintained for these. The modifications are also optimized to not
lock conventional zones. To do so, additional data is introduced in the
request queue structure so that the low level scsi code can pass upward
information such as the total number of zones and zone types of the device.
The availability of this new data avoids difficulties in accessing this
information from the I/O scheduler initialization method (init_queue() method)
context.

Of note is that the last patch of this series removes setting mq-deadline as the
default scheduler for block devices with a single hardware queue. The reason for
this is that setting the default scheduler is done very early in the device
initialization sequence, when the disk characteristics are not yet known. This
results in mq-deadline not correctly setting the default zone write locking
behavior based on the device zoning model. Setting a default I/O scheduler
can instead be done easily with udev rules later in the system initialization
process, leading to correct default settings for zoned block devices.
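
As an illustration, a udev rule along the following lines could apply
mq-deadline to host-managed zoned disks once their zoning model is known. The
rule file name and attribute matches here are examples and should be checked
against the target system:

```
# /etc/udev/rules.d/99-zoned-io-scheduler.rules (illustrative)
ACTION=="add|change", KERNEL=="sd[a-z]*", \
  ATTR{queue/zoned}=="host-managed", ATTR{queue/scheduler}="mq-deadline"
```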

Comments are as always very much appreciated.

Changes from v4:
* Various fixes and improvements (From Christoph's comments)
* Dropped zones_wlock scheduler tunable attribute

Changes from v3:
* Integrated support directly into mq-deadline instead of creating a new I/O
  scheduler.
* Disable setting of default mq scheduler for single queue devices

Changes from v2:
* Introduced blk_zoned structure
* Moved I/O scheduler from drivers/scsi to block 

Changes from v1:
* Addressed Bart's comments for the blk-mq patches (declarations files)
* Split (former) patch 4 into multiple patches to facilitate review
* Fixed scsi disk lookup from io scheduler by introducing
  scsi_disk_from_queue()

Damien Le Moal (14):
  scsi: sd_zbc: Move ZBC declarations to scsi_proto.h
  scsi: sd_zbc: Fix comments and indentation
  scsi: sd_zbc: Rearrange code
  scsi: sd_zbc: Use well defined macros
  scsi: sd_zbc: Fix sd_zbc_read_zoned_characteristics()
  block: Add zoned block device information to request queue
  scsi: sd_zbc: Initialize device request queue zoned data
  scsi: sd_zbc: Limit zone write locking to sequential zones
  scsi: sd_zbc: Disable zone write locking with scsi-mq
  block: mq-deadline: Add zoned block device data
  block: mq-deadline: Introduce dispatch helpers
  block: mq-deadline: Introduce zone locking support
  block: mq-deadline: Limit write request dispatch for zoned block
    devices
  block: do not set mq default scheduler

 block/elevator.c          |  17 +--
 block/mq-deadline.c       | 265 ++++++++++++++++++++++++++++++++++--
 drivers/scsi/scsi_lib.c   |   5 +-
 drivers/scsi/sd_zbc.c     | 340 +++++++++++++++++++++++++++++++++++-----------
 include/linux/blkdev.h    |  53 ++++++++
 include/scsi/scsi_proto.h |  45 ++++--
 6 files changed, 612 insertions(+), 113 deletions(-)

-- 
2.13.5

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH V5 01/14] scsi: sd_zbc: Move ZBC declarations to scsi_proto.h
  2017-09-25  6:14 [PATCH V5 00/14] scsi-mq support for ZBC disks Damien Le Moal
@ 2017-09-25  6:14 ` Damien Le Moal
  2017-09-25  6:14 ` [PATCH V5 02/14] scsi: sd_zbc: Fix comments and indentation Damien Le Moal
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 40+ messages in thread
From: Damien Le Moal @ 2017-09-25  6:14 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

Move standard macro definitions for the zone types and zone conditions
to scsi_proto.h together with the definitions related to the
REPORT ZONES command. While at it, explicitly define all values in the
enums for clarity.

Also remove unnecessary includes in sd_zbc.c.

No functional change is introduced by this patch.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Bart Van Assche <Bart.VanAssche@wdc.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/sd_zbc.c     | 24 ------------------------
 include/scsi/scsi_proto.h | 45 ++++++++++++++++++++++++++++++++++-----------
 2 files changed, 34 insertions(+), 35 deletions(-)

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 8aa54779aac1..692c8cbc7ed8 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -28,32 +28,8 @@
 
 #include <scsi/scsi.h>
 #include <scsi/scsi_cmnd.h>
-#include <scsi/scsi_dbg.h>
-#include <scsi/scsi_device.h>
-#include <scsi/scsi_driver.h>
-#include <scsi/scsi_host.h>
-#include <scsi/scsi_eh.h>
 
 #include "sd.h"
-#include "scsi_priv.h"
-
-enum zbc_zone_type {
-	ZBC_ZONE_TYPE_CONV = 0x1,
-	ZBC_ZONE_TYPE_SEQWRITE_REQ,
-	ZBC_ZONE_TYPE_SEQWRITE_PREF,
-	ZBC_ZONE_TYPE_RESERVED,
-};
-
-enum zbc_zone_cond {
-	ZBC_ZONE_COND_NO_WP,
-	ZBC_ZONE_COND_EMPTY,
-	ZBC_ZONE_COND_IMP_OPEN,
-	ZBC_ZONE_COND_EXP_OPEN,
-	ZBC_ZONE_COND_CLOSED,
-	ZBC_ZONE_COND_READONLY = 0xd,
-	ZBC_ZONE_COND_FULL,
-	ZBC_ZONE_COND_OFFLINE,
-};
 
 /**
  * Convert a zone descriptor to a zone struct.
diff --git a/include/scsi/scsi_proto.h b/include/scsi/scsi_proto.h
index 8c285d9a06d8..39130a9c05bf 100644
--- a/include/scsi/scsi_proto.h
+++ b/include/scsi/scsi_proto.h
@@ -301,19 +301,42 @@ struct scsi_lun {
 
 /* Reporting options for REPORT ZONES */
 enum zbc_zone_reporting_options {
-	ZBC_ZONE_REPORTING_OPTION_ALL = 0,
-	ZBC_ZONE_REPORTING_OPTION_EMPTY,
-	ZBC_ZONE_REPORTING_OPTION_IMPLICIT_OPEN,
-	ZBC_ZONE_REPORTING_OPTION_EXPLICIT_OPEN,
-	ZBC_ZONE_REPORTING_OPTION_CLOSED,
-	ZBC_ZONE_REPORTING_OPTION_FULL,
-	ZBC_ZONE_REPORTING_OPTION_READONLY,
-	ZBC_ZONE_REPORTING_OPTION_OFFLINE,
-	ZBC_ZONE_REPORTING_OPTION_NEED_RESET_WP = 0x10,
-	ZBC_ZONE_REPORTING_OPTION_NON_SEQWRITE,
-	ZBC_ZONE_REPORTING_OPTION_NON_WP = 0x3f,
+	ZBC_ZONE_REPORTING_OPTION_ALL		= 0x00,
+	ZBC_ZONE_REPORTING_OPTION_EMPTY		= 0x01,
+	ZBC_ZONE_REPORTING_OPTION_IMPLICIT_OPEN	= 0x02,
+	ZBC_ZONE_REPORTING_OPTION_EXPLICIT_OPEN	= 0x03,
+	ZBC_ZONE_REPORTING_OPTION_CLOSED	= 0x04,
+	ZBC_ZONE_REPORTING_OPTION_FULL		= 0x05,
+	ZBC_ZONE_REPORTING_OPTION_READONLY	= 0x06,
+	ZBC_ZONE_REPORTING_OPTION_OFFLINE	= 0x07,
+	/* 0x08 to 0x0f are reserved */
+	ZBC_ZONE_REPORTING_OPTION_NEED_RESET_WP	= 0x10,
+	ZBC_ZONE_REPORTING_OPTION_NON_SEQWRITE	= 0x11,
+	/* 0x12 to 0x3e are reserved */
+	ZBC_ZONE_REPORTING_OPTION_NON_WP	= 0x3f,
 };
 
 #define ZBC_REPORT_ZONE_PARTIAL 0x80
 
+/* Zone types of REPORT ZONES zone descriptors */
+enum zbc_zone_type {
+	ZBC_ZONE_TYPE_CONV		= 0x1,
+	ZBC_ZONE_TYPE_SEQWRITE_REQ	= 0x2,
+	ZBC_ZONE_TYPE_SEQWRITE_PREF	= 0x3,
+	/* 0x4 to 0xf are reserved */
+};
+
+/* Zone conditions of REPORT ZONES zone descriptors */
+enum zbc_zone_cond {
+	ZBC_ZONE_COND_NO_WP		= 0x0,
+	ZBC_ZONE_COND_EMPTY		= 0x1,
+	ZBC_ZONE_COND_IMP_OPEN		= 0x2,
+	ZBC_ZONE_COND_EXP_OPEN		= 0x3,
+	ZBC_ZONE_COND_CLOSED		= 0x4,
+	/* 0x5 to 0xc are reserved */
+	ZBC_ZONE_COND_READONLY		= 0xd,
+	ZBC_ZONE_COND_FULL		= 0xe,
+	ZBC_ZONE_COND_OFFLINE		= 0xf,
+};
+
 #endif /* _SCSI_PROTO_H_ */
-- 
2.13.5

* [PATCH V5 02/14] scsi: sd_zbc: Fix comments and indentation
  2017-09-25  6:14 [PATCH V5 00/14] scsi-mq support for ZBC disks Damien Le Moal
  2017-09-25  6:14 ` [PATCH V5 01/14] scsi: sd_zbc: Move ZBC declarations to scsi_proto.h Damien Le Moal
@ 2017-09-25  6:14 ` Damien Le Moal
  2017-10-02 23:01     ` Bart Van Assche
  2017-09-25  6:14 ` [PATCH V5 03/14] scsi: sd_zbc: Rearrange code Damien Le Moal
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 40+ messages in thread
From: Damien Le Moal @ 2017-09-25  6:14 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

Fix comment style (use kernel-doc style) and content to clarify some
functions. Also fix the indentation of some function signatures and remove
a useless blank line in sd_zbc_read_zones().

No functional change is introduced by this patch.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/scsi_lib.c |   5 ++-
 drivers/scsi/sd_zbc.c   | 117 +++++++++++++++++++++++++++++++++++++++++-------
 2 files changed, 104 insertions(+), 18 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 9cf6a80fe297..c72b97a74906 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1752,7 +1752,10 @@ static void scsi_done(struct scsi_cmnd *cmd)
  *
  * Returns:     Nothing
  *
- * Lock status: IO request lock assumed to be held when called.
+ * Lock status: request queue lock assumed to be held when called.
+ *
+ * Note: See sd_zbc.c sd_zbc_write_lock_zone() for write order
+ * protection for ZBC disks.
  */
 static void scsi_request_fn(struct request_queue *q)
 	__releases(q->queue_lock)
diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 692c8cbc7ed8..023f705ae235 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -32,10 +32,14 @@
 #include "sd.h"
 
 /**
- * Convert a zone descriptor to a zone struct.
+ * sd_zbc_parse_report - Convert a zone descriptor to a struct blk_zone,
+ * @sdkp: The disk the report originated from
+ * @buf: Address of the report zone descriptor
+ * @zone: the destination zone structure
+ *
+ * All LBA sized values are converted to 512B sectors unit.
  */
-static void sd_zbc_parse_report(struct scsi_disk *sdkp,
-				u8 *buf,
+static void sd_zbc_parse_report(struct scsi_disk *sdkp, u8 *buf,
 				struct blk_zone *zone)
 {
 	struct scsi_device *sdp = sdkp->device;
@@ -58,7 +62,13 @@ static void sd_zbc_parse_report(struct scsi_disk *sdkp,
 }
 
 /**
- * Issue a REPORT ZONES scsi command.
+ * sd_zbc_report_zones - Issue a REPORT ZONES scsi command.
+ * @sdkp: The target disk
+ * @buf: Buffer to use for the reply
+ * @buflen: the buffer size
+ * @lba: Start LBA of the report
+ *
+ * For internal use during device validation.
  */
 static int sd_zbc_report_zones(struct scsi_disk *sdkp, unsigned char *buf,
 			       unsigned int buflen, sector_t lba)
@@ -99,6 +109,12 @@ static int sd_zbc_report_zones(struct scsi_disk *sdkp, unsigned char *buf,
 	return 0;
 }
 
+/**
+ * sd_zbc_setup_report_cmnd - Prepare a REPORT ZONES scsi command
+ * @cmd: The command to setup
+ *
+ * Call in sd_init_command() for a REQ_OP_ZONE_REPORT request.
+ */
 int sd_zbc_setup_report_cmnd(struct scsi_cmnd *cmd)
 {
 	struct request *rq = cmd->request;
@@ -141,6 +157,14 @@ int sd_zbc_setup_report_cmnd(struct scsi_cmnd *cmd)
 	return BLKPREP_OK;
 }
 
+/**
+ * sd_zbc_report_zones_complete - Process a REPORT ZONES scsi command reply.
+ * @scmd: The completed report zones command
+ * @good_bytes: reply size in bytes
+ *
+ * Convert all reported zone descriptors to struct blk_zone. The conversion
+ * is done in-place, directly in the request specified sg buffer.
+ */
 static void sd_zbc_report_zones_complete(struct scsi_cmnd *scmd,
 					 unsigned int good_bytes)
 {
@@ -196,17 +220,32 @@ static void sd_zbc_report_zones_complete(struct scsi_cmnd *scmd,
 	local_irq_restore(flags);
 }
 
+/**
+ * sd_zbc_zone_sectors - Get the device zone size in number of 512B sectors.
+ * @sdkp: The target disk
+ */
 static inline sector_t sd_zbc_zone_sectors(struct scsi_disk *sdkp)
 {
 	return logical_to_sectors(sdkp->device, sdkp->zone_blocks);
 }
 
+/**
+ * sd_zbc_zone_no - Get the number of the zone containing a sector.
+ * @sdkp: The target disk
+ * @sector: 512B sector address contained in the zone
+ */
 static inline unsigned int sd_zbc_zone_no(struct scsi_disk *sdkp,
 					  sector_t sector)
 {
 	return sectors_to_logical(sdkp->device, sector) >> sdkp->zone_shift;
 }
 
+/**
+ * sd_zbc_setup_reset_cmnd - Prepare a RESET WRITE POINTER scsi command.
+ * @cmd: the command to setup
+ *
+ * Called from sd_init_command() for a REQ_OP_ZONE_RESET request.
+ */
 int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *cmd)
 {
 	struct request *rq = cmd->request;
@@ -239,6 +278,23 @@ int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *cmd)
 	return BLKPREP_OK;
 }
 
+/**
+ * sd_zbc_write_lock_zone - Write lock a sequential zone.
+ * @cmd: write command
+ *
+ * Called from sd_init_command() for write requests (standard write, write same or
+ * write zeroes operations). If the request target zone is not already locked,
+ * the zone is locked and BLKPREP_OK returned, allowing the request to proceed
+ * through dispatch in scsi_request_fn(). Otherwise, BLKPREP_DEFER is returned,
+ * forcing the request to wait for the zone to be unlocked, that is, for the
+ * previously issued write request targeting the same zone to complete.
+ *
+ * This is called from blk_peek_request() context with the queue lock held and
+ * before the request is removed from the scheduler. As a result, multiple
+ * contexts executing concurrently scsi_request_fn() cannot result in write
+ * sequence reordering as only a single write request per zone is allowed to
+ * proceed.
+ */
 int sd_zbc_write_lock_zone(struct scsi_cmnd *cmd)
 {
 	struct request *rq = cmd->request;
@@ -261,10 +317,7 @@ int sd_zbc_write_lock_zone(struct scsi_cmnd *cmd)
 	 * Do not issue more than one write at a time per
 	 * zone. This solves write ordering problems due to
 	 * the unlocking of the request queue in the dispatch
-	 * path in the non scsi-mq case. For scsi-mq, this
-	 * also avoids potential write reordering when multiple
-	 * threads running on different CPUs write to the same
-	 * zone (with a synchronized sequential pattern).
+	 * path in the non scsi-mq case.
 	 */
 	if (sdkp->zones_wlock &&
 	    test_and_set_bit(zno, sdkp->zones_wlock))
@@ -276,6 +329,13 @@ int sd_zbc_write_lock_zone(struct scsi_cmnd *cmd)
 	return BLKPREP_OK;
 }
 
+/**
+ * sd_zbc_write_unlock_zone - Write unlock a sequential zone.
+ * @cmd: write command
+ *
+ * Called from sd_uninit_cmd(). Unlocking the request target zone will allow
+ * dispatching the next write request for the zone.
+ */
 void sd_zbc_write_unlock_zone(struct scsi_cmnd *cmd)
 {
 	struct request *rq = cmd->request;
@@ -290,8 +350,16 @@ void sd_zbc_write_unlock_zone(struct scsi_cmnd *cmd)
 	}
 }
 
-void sd_zbc_complete(struct scsi_cmnd *cmd,
-		     unsigned int good_bytes,
+/**
+ * sd_zbc_complete - ZBC command post processing.
+ * @cmd: Completed command
+ * @good_bytes: Command reply bytes
+ * @sshdr: command sense header
+ *
+ * Called from sd_done(). Process report zones reply and handle reset zone
+ * and write commands errors.
+ */
+void sd_zbc_complete(struct scsi_cmnd *cmd, unsigned int good_bytes,
 		     struct scsi_sense_hdr *sshdr)
 {
 	int result = cmd->result;
@@ -336,7 +404,11 @@ void sd_zbc_complete(struct scsi_cmnd *cmd,
 }
 
 /**
- * Read zoned block device characteristics (VPD page B6).
+ * sd_zbc_read_zoned_characteristics - Read zoned block device characteristics
+ * @sdkp: Target disk
+ * @buf: Buffer where to store the VPD page data
+ *
+ * Read VPD page B6.
  */
 static int sd_zbc_read_zoned_characteristics(struct scsi_disk *sdkp,
 					     unsigned char *buf)
@@ -366,10 +438,16 @@ static int sd_zbc_read_zoned_characteristics(struct scsi_disk *sdkp,
 }
 
 /**
- * Check reported capacity.
+ * sd_zbc_check_capacity - Check reported capacity.
+ * @sdkp: Target disk
+ * @buf: Buffer to use for commands
+ *
+ * A ZBC drive may report only the capacity of its conventional zones at
+ * LBA 0. This is indicated by the RC_BASIS field of the read capacity reply.
+ * Check this here. If the disk reported only its conventional zones capacity,
+ * get the total capacity by doing a report zones.
  */
-static int sd_zbc_check_capacity(struct scsi_disk *sdkp,
-				 unsigned char *buf)
+static int sd_zbc_check_capacity(struct scsi_disk *sdkp, unsigned char *buf)
 {
 	sector_t lba;
 	int ret;
@@ -399,6 +477,13 @@ static int sd_zbc_check_capacity(struct scsi_disk *sdkp,
 
 #define SD_ZBC_BUF_SIZE 131072
 
+/**
+ * sd_zbc_check_zone_size - Check the device zone sizes
+ * @sdkp: Target disk
+ *
+ * Check that all zones of the device are of equal size. The last zone can
+ * however be smaller. The zone size must also be a power of two number of LBAs.
+ */
 static int sd_zbc_check_zone_size(struct scsi_disk *sdkp)
 {
 	u64 zone_blocks;
@@ -525,8 +610,7 @@ static int sd_zbc_setup(struct scsi_disk *sdkp)
 	return 0;
 }
 
-int sd_zbc_read_zones(struct scsi_disk *sdkp,
-		      unsigned char *buf)
+int sd_zbc_read_zones(struct scsi_disk *sdkp, unsigned char *buf)
 {
 	int ret;
 
@@ -537,7 +621,6 @@ int sd_zbc_read_zones(struct scsi_disk *sdkp,
 		 */
 		return 0;
 
-
 	/* Get zoned block device characteristics */
 	ret = sd_zbc_read_zoned_characteristics(sdkp, buf);
 	if (ret)
-- 
2.13.5

* [PATCH V5 03/14] scsi: sd_zbc: Rearrange code
  2017-09-25  6:14 [PATCH V5 00/14] scsi-mq support for ZBC disks Damien Le Moal
  2017-09-25  6:14 ` [PATCH V5 01/14] scsi: sd_zbc: Move ZBC declarations to scsi_proto.h Damien Le Moal
  2017-09-25  6:14 ` [PATCH V5 02/14] scsi: sd_zbc: Fix comments and indentation Damien Le Moal
@ 2017-09-25  6:14 ` Damien Le Moal
  2017-09-25 21:02     ` Bart Van Assche
  2017-09-25  6:14 ` [PATCH V5 04/14] scsi: sd_zbc: Use well defined macros Damien Le Moal
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 40+ messages in thread
From: Damien Le Moal @ 2017-09-25  6:14 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

Rearrange sd_zbc_setup() to include use_16_for_rw and use_10_for_rw
assignments and move the calculation of sdkp->zone_shift together
with the assignment of the verified zone_blocks value in
sd_zbc_check_zone_size().

No functional change is introduced by this patch.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/sd_zbc.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 023f705ae235..7dbaf920679e 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -584,6 +584,7 @@ static int sd_zbc_check_zone_size(struct scsi_disk *sdkp)
 	}
 
 	sdkp->zone_blocks = zone_blocks;
+	sdkp->zone_shift = ilog2(zone_blocks);
 
 	return 0;
 }
@@ -591,10 +592,13 @@ static int sd_zbc_check_zone_size(struct scsi_disk *sdkp)
 static int sd_zbc_setup(struct scsi_disk *sdkp)
 {
 
+	/* READ16/WRITE16 is mandatory for ZBC disks */
+	sdkp->device->use_16_for_rw = 1;
+	sdkp->device->use_10_for_rw = 0;
+
 	/* chunk_sectors indicates the zone size */
 	blk_queue_chunk_sectors(sdkp->disk->queue,
 			logical_to_sectors(sdkp->device, sdkp->zone_blocks));
-	sdkp->zone_shift = ilog2(sdkp->zone_blocks);
 	sdkp->nr_zones = sdkp->capacity >> sdkp->zone_shift;
 	if (sdkp->capacity & (sdkp->zone_blocks - 1))
 		sdkp->nr_zones++;
@@ -657,10 +661,6 @@ int sd_zbc_read_zones(struct scsi_disk *sdkp, unsigned char *buf)
 	if (ret)
 		goto err;
 
-	/* READ16/WRITE16 is mandatory for ZBC disks */
-	sdkp->device->use_16_for_rw = 1;
-	sdkp->device->use_10_for_rw = 0;
-
 	return 0;
 
 err:
-- 
2.13.5

* [PATCH V5 04/14] scsi: sd_zbc: Use well defined macros
  2017-09-25  6:14 [PATCH V5 00/14] scsi-mq support for ZBC disks Damien Le Moal
                   ` (2 preceding siblings ...)
  2017-09-25  6:14 ` [PATCH V5 03/14] scsi: sd_zbc: Rearrange code Damien Le Moal
@ 2017-09-25  6:14 ` Damien Le Moal
  2017-09-25  6:14 ` [PATCH V5 05/14] scsi: sd_zbc: Fix sd_zbc_read_zoned_characteristics() Damien Le Moal
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 40+ messages in thread
From: Damien Le Moal @ 2017-09-25  6:14 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

Instead of open coding, use the min() macro to calculate the report zones
reply buffer length in sd_zbc_check_zone_size() and the round_up()
macro for calculating the number of zones in sd_zbc_setup().
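
For a power-of-two zone size, both forms compute the same zone count. A quick
user-space check of the round_up() based calculation, with round_up()
redefined here using the kernel's usual power-of-two form:

```c
#include <assert.h>

/* Power-of-two round_up(), same shape as the kernel macro. */
#define round_up(x, y) ((((x) - 1) | ((y) - 1)) + 1)

/*
 * Number of zones of a disk of @capacity blocks with @zone_blocks blocks
 * per zone, where zone_shift == ilog2(zone_blocks).
 */
static unsigned int nr_zones(unsigned long long capacity,
			     unsigned long long zone_blocks,
			     unsigned int zone_shift)
{
	return round_up(capacity, zone_blocks) >> zone_shift;
}
```

A capacity that is an exact multiple of the zone size yields capacity /
zone_blocks zones; any remainder adds one (smaller) last zone.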

No functional change is introduced by this patch.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/sd_zbc.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 7dbaf920679e..bbad851c1789 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -475,7 +475,7 @@ static int sd_zbc_check_capacity(struct scsi_disk *sdkp, unsigned char *buf)
 	return 0;
 }
 
-#define SD_ZBC_BUF_SIZE 131072
+#define SD_ZBC_BUF_SIZE 131072U
 
 /**
  * sd_zbc_check_zone_size - Check the device zone sizes
@@ -526,10 +526,7 @@ static int sd_zbc_check_zone_size(struct scsi_disk *sdkp)
 		/* Parse REPORT ZONES header */
 		list_length = get_unaligned_be32(&buf[0]) + 64;
 		rec = buf + 64;
-		if (list_length < SD_ZBC_BUF_SIZE)
-			buf_len = list_length;
-		else
-			buf_len = SD_ZBC_BUF_SIZE;
+		buf_len = min(list_length, SD_ZBC_BUF_SIZE);
 
 		/* Parse zone descriptors */
 		while (rec < buf + buf_len) {
@@ -599,9 +596,8 @@ static int sd_zbc_setup(struct scsi_disk *sdkp)
 	/* chunk_sectors indicates the zone size */
 	blk_queue_chunk_sectors(sdkp->disk->queue,
 			logical_to_sectors(sdkp->device, sdkp->zone_blocks));
-	sdkp->nr_zones = sdkp->capacity >> sdkp->zone_shift;
-	if (sdkp->capacity & (sdkp->zone_blocks - 1))
-		sdkp->nr_zones++;
+	sdkp->nr_zones =
+		round_up(sdkp->capacity, sdkp->zone_blocks) >> sdkp->zone_shift;
 
 	if (!sdkp->zones_wlock) {
 		sdkp->zones_wlock = kcalloc(BITS_TO_LONGS(sdkp->nr_zones),
-- 
2.13.5

* [PATCH V5 05/14] scsi: sd_zbc: Fix sd_zbc_read_zoned_characteristics()
  2017-09-25  6:14 [PATCH V5 00/14] scsi-mq support for ZBC disks Damien Le Moal
                   ` (3 preceding siblings ...)
  2017-09-25  6:14 ` [PATCH V5 04/14] scsi: sd_zbc: Use well defined macros Damien Le Moal
@ 2017-09-25  6:14 ` Damien Le Moal
  2017-09-25  6:14 ` [PATCH V5 06/14] block: Add zoned block device information to request queue Damien Le Moal
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 40+ messages in thread
From: Damien Le Moal @ 2017-09-25  6:14 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche, stable

The three values starting at byte 8 of the Zoned Block Device
Characteristics VPD page B6h are 32-bit values, not 64-bit. So use
get_unaligned_be32() to retrieve these values instead of get_unaligned_be64().
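
The distinction matters because a 64-bit read starting at byte 8 also swallows
the following field. A user-space equivalent of the 32-bit big-endian
accessor, shown here for illustration (the kernel uses get_unaligned_be32()):

```c
#include <assert.h>
#include <stdint.h>

/* Read a 32-bit big-endian value at an arbitrary byte offset. */
static uint32_t be32_at(const uint8_t *buf, unsigned int off)
{
	return ((uint32_t)buf[off] << 24) | ((uint32_t)buf[off + 1] << 16) |
	       ((uint32_t)buf[off + 2] << 8) | (uint32_t)buf[off + 3];
}
```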

Fixes: 89d947561077 ("sd: Implement support for ZBC devices")
Cc: <stable@vger.kernel.org>

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Bart Van Assche <Bart.VanAssche@wdc.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/sd_zbc.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index bbad851c1789..27793b9f54c0 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -423,15 +423,15 @@ static int sd_zbc_read_zoned_characteristics(struct scsi_disk *sdkp,
 	if (sdkp->device->type != TYPE_ZBC) {
 		/* Host-aware */
 		sdkp->urswrz = 1;
-		sdkp->zones_optimal_open = get_unaligned_be64(&buf[8]);
-		sdkp->zones_optimal_nonseq = get_unaligned_be64(&buf[12]);
+		sdkp->zones_optimal_open = get_unaligned_be32(&buf[8]);
+		sdkp->zones_optimal_nonseq = get_unaligned_be32(&buf[12]);
 		sdkp->zones_max_open = 0;
 	} else {
 		/* Host-managed */
 		sdkp->urswrz = buf[4] & 1;
 		sdkp->zones_optimal_open = 0;
 		sdkp->zones_optimal_nonseq = 0;
-		sdkp->zones_max_open = get_unaligned_be64(&buf[16]);
+		sdkp->zones_max_open = get_unaligned_be32(&buf[16]);
 	}
 
 	return 0;
-- 
2.13.5

* [PATCH V5 06/14] block: Add zoned block device information to request queue
  2017-09-25  6:14 [PATCH V5 00/14] scsi-mq support for ZBC disks Damien Le Moal
                   ` (4 preceding siblings ...)
  2017-09-25  6:14 ` [PATCH V5 05/14] scsi: sd_zbc: Fix sd_zbc_read_zoned_characteristics() Damien Le Moal
@ 2017-09-25  6:14 ` Damien Le Moal
  2017-09-25 10:05   ` Ming Lei
  2017-09-25 21:06     ` Bart Van Assche
  2017-09-25  6:14 ` [PATCH V5 07/14] scsi: sd_zbc: Initialize device request queue zoned data Damien Le Moal
                   ` (7 subsequent siblings)
  13 siblings, 2 replies; 40+ messages in thread
From: Damien Le Moal @ 2017-09-25  6:14 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

Components relying only on the request_queue structure for accessing
block devices (e.g. I/O schedulers) have a limited knowledge of the
device characteristics. In particular, the device capacity cannot be
easily discovered, which for a zoned block device also results in the
inability to easily know the number of zones of the device (the zone
size is indicated by the chunk_sectors field of the queue limits).
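
With the zone size exposed via chunk_sectors (a power of two number of 512B
sectors for these drives), the zone number of a request can be computed with a
shift, as in this simplified user-space sketch of the blk_queue_zone_no()
logic introduced by this patch:

```c
#include <assert.h>

typedef unsigned long long sector_t;

/* ilog2() for a power-of-two value such as chunk_sectors. */
static unsigned int ilog2_pow2(unsigned int v)
{
	unsigned int r = 0;

	while (v >>= 1)
		r++;
	return r;
}

/* Zone number containing @sector, for zones of @chunk_sectors 512B sectors. */
static unsigned int zone_no(sector_t sector, unsigned int chunk_sectors)
{
	return sector >> ilog2_pow2(chunk_sectors);
}
```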

Introduce the nr_zones field to the request_queue structure to simplify
access to this information. Also, add the bitmap seq_zone_bitmap which
indicates which zones of the device are sequential zones (write
preferred or write required). These two fields are initialized by the
low level block device driver (sd.c for ZBC/ZAC disks). They are not
initialized by stacking drivers (device mappers) handling zoned block
devices (e.g. dm-linear).

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 include/linux/blkdev.h | 53 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 53 insertions(+)

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 460294bb0fa5..90285f39030d 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -544,6 +544,18 @@ struct request_queue {
 	struct queue_limits	limits;
 
 	/*
+	 * Zoned block device information for mq I/O schedulers.
+	 * nr_zones is the total number of zones of the device. This is always
+	 * 0 for regular block devices. seq_zone_bitmap is a bitmap of nr_zones
+	 * bits which indicates if a zone is conventional (bit clear) or
+	 * sequential (bit set). Both nr_zones and seq_zone_bitmap are set
+	 * by the low level device driver. Stacking drivers (device mappers)
+	 * may or may not initialize these fields.
+	 */
+	unsigned int	nr_zones;
+	unsigned long	*seq_zone_bitmap;
+
+	/*
 	 * sg stuff
 	 */
 	unsigned int		sg_timeout;
@@ -785,6 +797,27 @@ static inline unsigned int blk_queue_zone_sectors(struct request_queue *q)
 	return blk_queue_is_zoned(q) ? q->limits.chunk_sectors : 0;
 }
 
+static inline unsigned int blk_queue_nr_zones(struct request_queue *q)
+{
+	return q->nr_zones;
+}
+
+static inline unsigned int blk_queue_zone_no(struct request_queue *q,
+					     sector_t sector)
+{
+	if (!blk_queue_is_zoned(q))
+		return 0;
+	return sector >> ilog2(q->limits.chunk_sectors);
+}
+
+static inline bool blk_queue_zone_is_seq(struct request_queue *q,
+					 sector_t sector)
+{
+	if (!blk_queue_is_zoned(q) || !q->seq_zone_bitmap)
+		return false;
+	return test_bit(blk_queue_zone_no(q, sector), q->seq_zone_bitmap);
+}
+
 static inline bool rq_is_sync(struct request *rq)
 {
 	return op_is_sync(rq->cmd_flags);
@@ -1031,6 +1064,16 @@ static inline unsigned int blk_rq_cur_sectors(const struct request *rq)
 	return blk_rq_cur_bytes(rq) >> 9;
 }
 
+static inline unsigned int blk_rq_zone_no(struct request *rq)
+{
+	return blk_queue_zone_no(rq->q, blk_rq_pos(rq));
+}
+
+static inline unsigned int blk_rq_zone_is_seq(struct request *rq)
+{
+	return blk_queue_zone_is_seq(rq->q, blk_rq_pos(rq));
+}
+
 /*
  * Some commands like WRITE SAME have a payload or data transfer size which
  * is different from the size of the request.  Any driver that supports such
@@ -1582,6 +1625,16 @@ static inline unsigned int bdev_zone_sectors(struct block_device *bdev)
 	return 0;
 }
 
+static inline unsigned int bdev_nr_zones(struct block_device *bdev)
+{
+	struct request_queue *q = bdev_get_queue(bdev);
+
+	if (q)
+		return blk_queue_nr_zones(q);
+
+	return 0;
+}
+
 static inline int queue_dma_alignment(struct request_queue *q)
 {
 	return q ? q->dma_alignment : 511;
-- 
2.13.5

* [PATCH V5 07/14] scsi: sd_zbc: Initialize device request queue zoned data
  2017-09-25  6:14 [PATCH V5 00/14] scsi-mq support for ZBC disks Damien Le Moal
                   ` (5 preceding siblings ...)
  2017-09-25  6:14 ` [PATCH V5 06/14] block: Add zoned block device information to request queue Damien Le Moal
@ 2017-09-25  6:14 ` Damien Le Moal
  2017-09-25 21:17     ` Bart Van Assche
  2017-09-25  6:14 ` [PATCH V5 08/14] scsi: sd_zbc: Limit zone write locking to sequential zones Damien Le Moal
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 40+ messages in thread
From: Damien Le Moal @ 2017-09-25  6:14 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

Initialize the seq_zone_bitmap and nr_zones fields of the disk request
queue on disk revalidate. As the seq_zone_bitmap allocation is
identical to the allocation of the zone write lock bitmap, introduce
the helper sd_zbc_alloc_zone_bitmap(). Since the disk capacity and
number of zones must have stabilized before the bitmaps can be sized,
allocate and initialize them on the second revalidation pass.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 drivers/scsi/sd_zbc.c | 144 ++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 139 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 27793b9f54c0..cc64fada9cd9 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -586,8 +586,127 @@ static int sd_zbc_check_zone_size(struct scsi_disk *sdkp)
 	return 0;
 }
 
+/**
+ * sd_zbc_alloc_zone_bitmap - Allocate a zone bitmap (one bit per zone).
+ * @sdkp: The disk for which to allocate the bitmap
+ */
+static inline unsigned long *sd_zbc_alloc_zone_bitmap(struct scsi_disk *sdkp)
+{
+	struct request_queue *q = sdkp->disk->queue;
+
+	return kzalloc_node(BITS_TO_LONGS(sdkp->nr_zones)
+			    * sizeof(unsigned long),
+			    GFP_KERNEL, q->node);
+}
+
+/**
+ * sd_zbc_get_seq_zones - Parse report zones reply to identify sequential zones
+ * @sdkp: disk used
+ * @buf: report reply buffer
+ * @seq_zone_bitmap: bitmap of sequential zones to set
+ * @zno: Zone number of the first zone in the report
+ *
+ * Parse reported zone descriptors to find sequential zones.
+ * Since read-only and offline zones cannot be written, do not
+ * mark them as sequential in the bitmap.
+ * Return the LBA after the last zone reported.
+ */
+static sector_t sd_zbc_get_seq_zones(struct scsi_disk *sdkp, unsigned char *buf,
+				     unsigned int buflen,
+				     unsigned long *seq_zone_bitmap,
+				     unsigned int *zno)
+{
+	sector_t last_lba = sdkp->capacity;
+	unsigned int buf_len, list_length;
+	unsigned int n = *zno;
+	unsigned char *rec;
+	u8 type, cond;
+
+	list_length = get_unaligned_be32(&buf[0]) + 64;
+	buf_len = min(list_length, buflen);
+	rec = buf + 64;
+
+	while (rec < buf + buf_len) {
+		type = rec[0] & 0x0f;
+		cond = (rec[1] >> 4) & 0xf;
+		if (type != ZBC_ZONE_TYPE_CONV &&
+		    cond != ZBC_ZONE_COND_READONLY &&
+		    cond != ZBC_ZONE_COND_OFFLINE)
+			set_bit(n, seq_zone_bitmap);
+		last_lba = get_unaligned_be64(&rec[8]) +
+			get_unaligned_be64(&rec[16]);
+		rec += 64;
+		n++;
+	}
+
+	*zno = n;
+
+	return last_lba;
+}
+
+/**
+ * sd_zbc_setup_seq_zone_bitmap - Initialize the disk seq zone bitmap.
+ * @sdkp: target disk
+ *
+ * Allocate a zone bitmap and initialize it by identifying sequential zones.
+ */
+static int sd_zbc_setup_seq_zone_bitmap(struct scsi_disk *sdkp)
+{
+	struct request_queue *q = sdkp->disk->queue;
+	unsigned long *seq_zone_bitmap;
+	sector_t lba = 0;
+	unsigned char *buf;
+	unsigned int n = 0;
+	int ret = -ENOMEM;
+
+	seq_zone_bitmap = sd_zbc_alloc_zone_bitmap(sdkp);
+	if (!seq_zone_bitmap)
+		return -ENOMEM;
+
+	buf = kmalloc(SD_ZBC_BUF_SIZE, GFP_KERNEL);
+	if (!buf)
+		goto out;
+
+	while (lba < sdkp->capacity) {
+		ret = sd_zbc_report_zones(sdkp, buf, SD_ZBC_BUF_SIZE, lba);
+		if (ret)
+			goto out;
+		lba = sd_zbc_get_seq_zones(sdkp, buf, SD_ZBC_BUF_SIZE,
+					   seq_zone_bitmap, &n);
+	}
+
+	if (n != sdkp->nr_zones) {
+		/* Something went wrong */
+		ret = -EIO;
+	}
+
+out:
+	kfree(buf);
+	if (ret) {
+		kfree(seq_zone_bitmap);
+		return ret;
+	}
+
+	q->seq_zone_bitmap = seq_zone_bitmap;
+
+	return 0;
+}
+
+static void sd_zbc_cleanup(struct scsi_disk *sdkp)
+{
+	struct request_queue *q = sdkp->disk->queue;
+
+	kfree(q->seq_zone_bitmap);
+	q->seq_zone_bitmap = NULL;
+
+	kfree(sdkp->zones_wlock);
+	sdkp->zones_wlock = NULL;
+}
+
 static int sd_zbc_setup(struct scsi_disk *sdkp)
 {
+	struct request_queue *q = sdkp->disk->queue;
+	int ret;
 
 	/* READ16/WRITE16 is mandatory for ZBC disks */
 	sdkp->device->use_16_for_rw = 1;
@@ -599,14 +718,29 @@ static int sd_zbc_setup(struct scsi_disk *sdkp)
 	sdkp->nr_zones =
 		round_up(sdkp->capacity, sdkp->zone_blocks) >> sdkp->zone_shift;
 
+	/*
+	 * Wait for the disk capacity to stabilize before
+	 * initializing zone related information.
+	 */
+	if (sdkp->first_scan)
+		return 0;
+
 	if (!sdkp->zones_wlock) {
-		sdkp->zones_wlock = kcalloc(BITS_TO_LONGS(sdkp->nr_zones),
-					    sizeof(unsigned long),
-					    GFP_KERNEL);
+		sdkp->zones_wlock = sd_zbc_alloc_zone_bitmap(sdkp);
 		if (!sdkp->zones_wlock)
 			return -ENOMEM;
 	}
 
+	if (!q->seq_zone_bitmap) {
+		ret = sd_zbc_setup_seq_zone_bitmap(sdkp);
+		if (ret) {
+			sd_zbc_cleanup(sdkp);
+			return ret;
+		}
+	}
+
+	q->nr_zones = sdkp->nr_zones;
+
 	return 0;
 }
 
@@ -661,14 +795,14 @@ int sd_zbc_read_zones(struct scsi_disk *sdkp, unsigned char *buf)
 
 err:
 	sdkp->capacity = 0;
+	sd_zbc_cleanup(sdkp);
 
 	return ret;
 }
 
 void sd_zbc_remove(struct scsi_disk *sdkp)
 {
-	kfree(sdkp->zones_wlock);
-	sdkp->zones_wlock = NULL;
+	sd_zbc_cleanup(sdkp);
 }
 
 void sd_zbc_print_zones(struct scsi_disk *sdkp)
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH V5 08/14] scsi: sd_zbc: Limit zone write locking to sequential zones
  2017-09-25  6:14 [PATCH V5 00/14] scsi-mq support for ZBC disks Damien Le Moal
                   ` (6 preceding siblings ...)
  2017-09-25  6:14 ` [PATCH V5 07/14] scsi: sd_zbc: Initialize device request queue zoned data Damien Le Moal
@ 2017-09-25  6:14 ` Damien Le Moal
  2017-09-25  6:14 ` [PATCH V5 09/14] scsi: sd_zbc: Disable zone write locking with scsi-mq Damien Le Moal
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 40+ messages in thread
From: Damien Le Moal @ 2017-09-25  6:14 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

Conventional zones of zoned block devices have no write constraints.
Write locking of conventional zones is thus not necessary and can even
hurt performance by unnecessarily operating the disk at a low queue
depth. To avoid this, use the disk request queue seq_zone_bitmap to
allow any write to be issued to conventional zones, locking only
sequential zones.

While at it, remove the helper sd_zbc_zone_no() and use
blk_rq_zone_no() instead.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 drivers/scsi/sd_zbc.c | 32 ++++++++++++--------------------
 1 file changed, 12 insertions(+), 20 deletions(-)

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index cc64fada9cd9..12d663614099 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -230,17 +230,6 @@ static inline sector_t sd_zbc_zone_sectors(struct scsi_disk *sdkp)
 }
 
 /**
- * sd_zbc_zone_no - Get the number of the zone conataining a sector.
- * @sdkp: The target disk
- * @sector: 512B sector address contained in the zone
- */
-static inline unsigned int sd_zbc_zone_no(struct scsi_disk *sdkp,
-					  sector_t sector)
-{
-	return sectors_to_logical(sdkp->device, sector) >> sdkp->zone_shift;
-}
-
-/**
  * sd_zbc_setup_reset_cmnd - Prepare a RESET WRITE POINTER scsi command.
  * @cmd: the command to setup
  *
@@ -301,7 +290,6 @@ int sd_zbc_write_lock_zone(struct scsi_cmnd *cmd)
 	struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
 	sector_t sector = blk_rq_pos(rq);
 	sector_t zone_sectors = sd_zbc_zone_sectors(sdkp);
-	unsigned int zno = sd_zbc_zone_no(sdkp, sector);
 
 	/*
 	 * Note: Checks of the alignment of the write command on
@@ -309,18 +297,21 @@ int sd_zbc_write_lock_zone(struct scsi_cmnd *cmd)
 	 */
 
 	/* Do not allow zone boundaries crossing on host-managed drives */
-	if (blk_queue_zoned_model(sdkp->disk->queue) == BLK_ZONED_HM &&
+	if (blk_queue_zoned_model(rq->q) == BLK_ZONED_HM &&
 	    (sector & (zone_sectors - 1)) + blk_rq_sectors(rq) > zone_sectors)
 		return BLKPREP_KILL;
 
 	/*
-	 * Do not issue more than one write at a time per
-	 * zone. This solves write ordering problems due to
-	 * the unlocking of the request queue in the dispatch
-	 * path in the non scsi-mq case.
+	 * There are no write constraints on conventional zones, so any write
+	 * command can be sent. However, do not issue more than one write
+	 * command at a time per sequential zone. This avoids write ordering
+	 * problems due to the unlocking of the request queue in the legacy
+	 * scsi dispatch path, as well as at the HBA level (e.g. AHCI).
 	 */
+	if (!blk_rq_zone_is_seq(rq))
+		return BLKPREP_OK;
 	if (sdkp->zones_wlock &&
-	    test_and_set_bit(zno, sdkp->zones_wlock))
+	    test_and_set_bit(blk_rq_zone_no(rq), sdkp->zones_wlock))
 		return BLKPREP_DEFER;
 
 	WARN_ON_ONCE(cmd->flags & SCMD_ZONE_WRITE_LOCK);
@@ -341,8 +332,9 @@ void sd_zbc_write_unlock_zone(struct scsi_cmnd *cmd)
 	struct request *rq = cmd->request;
 	struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
 
-	if (sdkp->zones_wlock && cmd->flags & SCMD_ZONE_WRITE_LOCK) {
-		unsigned int zno = sd_zbc_zone_no(sdkp, blk_rq_pos(rq));
+	if (cmd->flags & SCMD_ZONE_WRITE_LOCK) {
+		unsigned int zno = blk_rq_zone_no(rq);
+
 		WARN_ON_ONCE(!test_bit(zno, sdkp->zones_wlock));
 		cmd->flags &= ~SCMD_ZONE_WRITE_LOCK;
 		clear_bit_unlock(zno, sdkp->zones_wlock);
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH V5 09/14] scsi: sd_zbc: Disable zone write locking with scsi-mq
  2017-09-25  6:14 [PATCH V5 00/14] scsi-mq support for ZBC disks Damien Le Moal
                   ` (7 preceding siblings ...)
  2017-09-25  6:14 ` [PATCH V5 08/14] scsi: sd_zbc: Limit zone write locking to sequential zones Damien Le Moal
@ 2017-09-25  6:14 ` Damien Le Moal
  2017-09-25  6:14 ` [PATCH V5 10/14] block: mq-deadline: Add zoned block device data Damien Le Moal
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 40+ messages in thread
From: Damien Le Moal @ 2017-09-25  6:14 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

In the case of a ZBC disk used with scsi-mq, zone write locking does
not prevent write reordering in sequential zones. Unlike the legacy
case, zone locking is done after the command request is removed from
the scheduler dispatch queue. That is, at the time of zone locking,
the write command may already be out of order, making locking
ineffective. Write order guarantees can only be provided by an
adapted I/O scheduler.

Disable zone write locking in sd_zbc_write_lock_zone() if the disk is
used with scsi-mq. As the disk zones_wlock bitmap is not necessary,
do not allocate it.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Bart Van Assche <Bart.VanAssche@wdc.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/sd_zbc.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 12d663614099..201b2983d8b8 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -287,6 +287,7 @@ int sd_zbc_setup_reset_cmnd(struct scsi_cmnd *cmd)
 int sd_zbc_write_lock_zone(struct scsi_cmnd *cmd)
 {
 	struct request *rq = cmd->request;
+	struct request_queue *q = rq->q;
 	struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
 	sector_t sector = blk_rq_pos(rq);
 	sector_t zone_sectors = sd_zbc_zone_sectors(sdkp);
@@ -297,10 +298,14 @@ int sd_zbc_write_lock_zone(struct scsi_cmnd *cmd)
 	 */
 
 	/* Do not allow zone boundaries crossing on host-managed drives */
-	if (blk_queue_zoned_model(rq->q) == BLK_ZONED_HM &&
+	if (blk_queue_zoned_model(q) == BLK_ZONED_HM &&
 	    (sector & (zone_sectors - 1)) + blk_rq_sectors(rq) > zone_sectors)
 		return BLKPREP_KILL;
 
+	/* No write locking with scsi-mq */
+	if (q->mq_ops)
+		return BLKPREP_OK;
+
 	/*
 	 * There are no write constraints on conventional zones, so any write
 	 * command can be sent. However, do not issue more than one write
@@ -717,7 +722,7 @@ static int sd_zbc_setup(struct scsi_disk *sdkp)
 	if (sdkp->first_scan)
 		return 0;
 
-	if (!sdkp->zones_wlock) {
+	if (!q->mq_ops && !sdkp->zones_wlock) {
 		sdkp->zones_wlock = sd_zbc_alloc_zone_bitmap(sdkp);
 		if (!sdkp->zones_wlock)
 			return -ENOMEM;
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH V5 10/14] block: mq-deadline: Add zoned block device data
  2017-09-25  6:14 [PATCH V5 00/14] scsi-mq support for ZBC disks Damien Le Moal
                   ` (8 preceding siblings ...)
  2017-09-25  6:14 ` [PATCH V5 09/14] scsi: sd_zbc: Disable zone write locking with scsi-mq Damien Le Moal
@ 2017-09-25  6:14 ` Damien Le Moal
  2017-09-25 21:34     ` Bart Van Assche
  2017-09-25  6:14 ` [PATCH V5 11/14] block: mq-deadline: Introduce dispatch helpers Damien Le Moal
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 40+ messages in thread
From: Damien Le Moal @ 2017-09-25  6:14 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

Introduce new fields to mq-deadline private data to support zoned block
devices. The fields are a zone bitmap used to implement zone write
locking and a spinlock to atomically handle zone write locking with
other processing.

Modify the mq-deadline init_queue and exit_queue elevator methods to handle
initialization and cleanup of the zone write lock bitmap.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 block/mq-deadline.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 46 insertions(+), 4 deletions(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index a1cad4331edd..af2eb9b3936e 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -60,6 +60,9 @@ struct deadline_data {
 
 	spinlock_t lock;
 	struct list_head dispatch;
+
+	spinlock_t zone_lock;
+	unsigned long *zones_wlock;
 };
 
 static inline struct rb_root *
@@ -300,6 +303,34 @@ static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
 	return rq;
 }
 
+static int deadline_init_zones_wlock(struct request_queue *q,
+				     struct deadline_data *dd)
+{
+	/*
+	 * For regular drives or non-conforming zoned block devices,
+	 * do not use zone write locking.
+	 */
+	if (!blk_queue_nr_zones(q))
+		return 0;
+
+	/*
+	 * Treat host aware drives as regular disks.
+	 */
+	if (blk_queue_zoned_model(q) != BLK_ZONED_HM)
+		return 0;
+
+	dd->zones_wlock = kzalloc_node(BITS_TO_LONGS(blk_queue_nr_zones(q))
+				       * sizeof(unsigned long),
+				       GFP_KERNEL, q->node);
+	if (!dd->zones_wlock)
+		return -ENOMEM;
+
+	pr_info("mq-deadline: %s: zone write locking enabled\n",
+		dev_name(q->backing_dev_info->dev));
+
+	return 0;
+}
+
 static void dd_exit_queue(struct elevator_queue *e)
 {
 	struct deadline_data *dd = e->elevator_data;
@@ -307,6 +338,7 @@ static void dd_exit_queue(struct elevator_queue *e)
 	BUG_ON(!list_empty(&dd->fifo_list[READ]));
 	BUG_ON(!list_empty(&dd->fifo_list[WRITE]));
 
+	kfree(dd->zones_wlock);
 	kfree(dd);
 }
 
@@ -317,16 +349,15 @@ static int dd_init_queue(struct request_queue *q, struct elevator_type *e)
 {
 	struct deadline_data *dd;
 	struct elevator_queue *eq;
+	int ret = -ENOMEM;
 
 	eq = elevator_alloc(q, e);
 	if (!eq)
 		return -ENOMEM;
 
 	dd = kzalloc_node(sizeof(*dd), GFP_KERNEL, q->node);
-	if (!dd) {
-		kobject_put(&eq->kobj);
-		return -ENOMEM;
-	}
+	if (!dd)
+		goto out_put_elv;
 	eq->elevator_data = dd;
 
 	INIT_LIST_HEAD(&dd->fifo_list[READ]);
@@ -340,9 +371,20 @@ static int dd_init_queue(struct request_queue *q, struct elevator_type *e)
 	dd->fifo_batch = fifo_batch;
 	spin_lock_init(&dd->lock);
 	INIT_LIST_HEAD(&dd->dispatch);
+	spin_lock_init(&dd->zone_lock);
+
+	ret = deadline_init_zones_wlock(q, dd);
+	if (ret)
+		goto out_free_dd;
 
 	q->elevator = eq;
 	return 0;
+
+out_free_dd:
+	kfree(dd);
+out_put_elv:
+	kobject_put(&eq->kobj);
+	return ret;
 }
 
 static int dd_request_merge(struct request_queue *q, struct request **rq,
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH V5 11/14] block: mq-deadline: Introduce dispatch helpers
  2017-09-25  6:14 [PATCH V5 00/14] scsi-mq support for ZBC disks Damien Le Moal
                   ` (9 preceding siblings ...)
  2017-09-25  6:14 ` [PATCH V5 10/14] block: mq-deadline: Add zoned block device data Damien Le Moal
@ 2017-09-25  6:14 ` Damien Le Moal
  2017-09-25 21:44     ` Bart Van Assche
  2017-09-25  6:14 ` [PATCH V5 12/14] block: mq-deadline: Introduce zone locking support Damien Le Moal
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 40+ messages in thread
From: Damien Le Moal @ 2017-09-25  6:14 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

Avoid directly referencing the next_rq and fifo_list arrays by using
the helper functions deadline_next_request() and deadline_fifo_request().
This facilitates upcoming changes to the dispatch request selection in
__dd_dispatch_request().

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 block/mq-deadline.c | 45 +++++++++++++++++++++++++++++++++++++--------
 1 file changed, 37 insertions(+), 8 deletions(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index af2eb9b3936e..296880e2471f 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -195,13 +195,42 @@ static inline int deadline_check_fifo(struct deadline_data *dd, int ddir)
 }
 
 /*
+ * For the specified data direction, return the next request to
+ * dispatch using arrival ordered lists.
+ */
+static struct request *
+deadline_fifo_request(struct deadline_data *dd, int data_dir)
+{
+	if (WARN_ON_ONCE(data_dir != READ && data_dir != WRITE))
+		return NULL;
+
+	if (list_empty(&dd->fifo_list[data_dir]))
+		return NULL;
+
+	return rq_entry_fifo(dd->fifo_list[data_dir].next);
+}
+
+/*
+ * For the specified data direction, return the next request to
+ * dispatch using sector position sorted lists.
+ */
+static struct request *
+deadline_next_request(struct deadline_data *dd, int data_dir)
+{
+	if (WARN_ON_ONCE(data_dir != READ && data_dir != WRITE))
+		return NULL;
+
+	return dd->next_rq[data_dir];
+}
+
+/*
  * deadline_dispatch_requests selects the best request according to
  * read/write expire, fifo_batch, etc
  */
 static struct request *__dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
 {
 	struct deadline_data *dd = hctx->queue->elevator->elevator_data;
-	struct request *rq;
+	struct request *rq, *next_rq;
 	bool reads, writes;
 	int data_dir;
 
@@ -217,10 +246,9 @@ static struct request *__dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
 	/*
 	 * batches are currently reads XOR writes
 	 */
-	if (dd->next_rq[WRITE])
-		rq = dd->next_rq[WRITE];
-	else
-		rq = dd->next_rq[READ];
+	rq = deadline_next_request(dd, WRITE);
+	if (!rq)
+		rq = deadline_next_request(dd, READ);
 
 	if (rq && dd->batching < dd->fifo_batch)
 		/* we have a next request are still entitled to batch */
@@ -263,19 +291,20 @@ static struct request *__dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
 	/*
 	 * we are not running a batch, find best request for selected data_dir
 	 */
-	if (deadline_check_fifo(dd, data_dir) || !dd->next_rq[data_dir]) {
+	next_rq = deadline_next_request(dd, data_dir);
+	if (deadline_check_fifo(dd, data_dir) || !next_rq) {
 		/*
 		 * A deadline has expired, the last request was in the other
 		 * direction, or we have run out of higher-sectored requests.
 		 * Start again from the request with the earliest expiry time.
 		 */
-		rq = rq_entry_fifo(dd->fifo_list[data_dir].next);
+		rq = deadline_fifo_request(dd, data_dir);
 	} else {
 		/*
 		 * The last req was the same dir and we have a next request in
 		 * sort order. No expired requests so continue on from here.
 		 */
-		rq = dd->next_rq[data_dir];
+		rq = next_rq;
 	}
 
 	dd->batching = 0;
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH V5 12/14] block: mq-deadline: Introduce zone locking support
  2017-09-25  6:14 [PATCH V5 00/14] scsi-mq support for ZBC disks Damien Le Moal
                   ` (10 preceding siblings ...)
  2017-09-25  6:14 ` [PATCH V5 11/14] block: mq-deadline: Introduce dispatch helpers Damien Le Moal
@ 2017-09-25  6:14 ` Damien Le Moal
  2017-09-25 22:00     ` Bart Van Assche
  2017-09-25  6:14 ` [PATCH V5 13/14] block: mq-deadline: Limit write request dispatch for zoned block devices Damien Le Moal
  2017-09-25  6:14 ` [PATCH V5 14/14] block: do not set mq default scheduler Damien Le Moal
  13 siblings, 1 reply; 40+ messages in thread
From: Damien Le Moal @ 2017-09-25  6:14 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

For a write request to a zoned block device, lock the request target
zone upon request dispatch. The zone is unlocked either when the
request completes or when the request is requeued (inserted).

To indicate that a request has locked its target zone, use the first
pointer of the request elevator private data to store the value
RQ_ZONE_WLOCKED. Testing for this value allows a quick decision in
dd_insert_request() and dd_completed_request() on whether the target
zone of a request needs to be unlocked.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 block/mq-deadline.c | 114 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 114 insertions(+)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 296880e2471f..186c32099845 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -178,6 +178,93 @@ deadline_move_request(struct deadline_data *dd, struct request *rq)
 }
 
 /*
+ * Return true if a request is a write request that needs zone
+ * write locking.
+ */
+static inline bool deadline_request_needs_zone_wlock(struct deadline_data *dd,
+						     struct request *rq)
+{
+
+	if (!dd->zones_wlock)
+		return false;
+
+	if (blk_rq_is_passthrough(rq))
+		return false;
+
+	switch (req_op(rq)) {
+	case REQ_OP_WRITE_ZEROES:
+	case REQ_OP_WRITE_SAME:
+	case REQ_OP_WRITE:
+		return blk_rq_zone_is_seq(rq);
+	default:
+		return false;
+	}
+}
+
+/*
+ * Abuse the elv.priv[0] pointer to indicate if a request has write
+ * locked its target zone. Only write request to a zoned block device
+ * can own a zone write lock.
+ */
+#define RQ_ZONE_WLOCKED		((void *)1UL)
+static inline void deadline_set_request_zone_wlock(struct request *rq)
+{
+	rq->elv.priv[0] = RQ_ZONE_WLOCKED;
+}
+
+#define RQ_ZONE_NO_WLOCK	((void *)0UL)
+static inline void deadline_clear_request_zone_wlock(struct request *rq)
+{
+	rq->elv.priv[0] = RQ_ZONE_NO_WLOCK;
+}
+
+static inline bool deadline_request_has_zone_wlock(struct request *rq)
+{
+	return rq->elv.priv[0] == RQ_ZONE_WLOCKED;
+}
+
+/*
+ * Write lock the target zone of a write request.
+ */
+static void deadline_wlock_zone(struct deadline_data *dd,
+				struct request *rq)
+{
+	unsigned int zno = blk_rq_zone_no(rq);
+
+	WARN_ON_ONCE(deadline_request_has_zone_wlock(rq));
+	WARN_ON_ONCE(test_and_set_bit(zno, dd->zones_wlock));
+	deadline_set_request_zone_wlock(rq);
+}
+
+/*
+ * Write unlock the target zone of a write request.
+ */
+static void deadline_wunlock_zone(struct deadline_data *dd,
+				  struct request *rq)
+{
+	unsigned int zno = blk_rq_zone_no(rq);
+	unsigned long flags;
+
+	spin_lock_irqsave(&dd->zone_lock, flags);
+
+	WARN_ON_ONCE(!test_and_clear_bit(zno, dd->zones_wlock));
+	deadline_clear_request_zone_wlock(rq);
+
+	spin_unlock_irqrestore(&dd->zone_lock, flags);
+}
+
+/*
+ * Test the write lock state of the target zone of a write request.
+ */
+static inline bool deadline_zone_is_wlocked(struct deadline_data *dd,
+					    struct request *rq)
+{
+	unsigned int zno = blk_rq_zone_no(rq);
+
+	return test_bit(zno, dd->zones_wlock);
+}
+
+/*
  * deadline_check_fifo returns 0 if there are no expired requests on the fifo,
  * 1 otherwise. Requires !list_empty(&dd->fifo_list[data_dir])
  */
@@ -316,6 +403,11 @@ static struct request *__dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
 	dd->batching++;
 	deadline_move_request(dd, rq);
 done:
+	/*
+	 * If the request needs its target zone locked, do it.
+	 */
+	if (deadline_request_needs_zone_wlock(dd, rq))
+		deadline_wlock_zone(dd, rq);
 	rq->rq_flags |= RQF_STARTED;
 	return rq;
 }
@@ -466,6 +558,13 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 	struct deadline_data *dd = q->elevator->elevator_data;
 	const int data_dir = rq_data_dir(rq);
 
+	/*
+	 * This may be a requeue of a request that has locked its
+	 * target zone. If this is the case, release the request zone lock.
+	 */
+	if (deadline_request_has_zone_wlock(rq))
+		deadline_wunlock_zone(dd, rq);
+
 	if (blk_mq_sched_try_insert_merge(q, rq))
 		return;
 
@@ -510,6 +609,20 @@ static void dd_insert_requests(struct blk_mq_hw_ctx *hctx,
 	spin_unlock(&dd->lock);
 }
 
+/*
+ * For zoned block devices, write unlock the target zone of
+ * completed write requests.
+ */
+static void dd_completed_request(struct request *rq)
+{
+
+	if (deadline_request_has_zone_wlock(rq)) {
+		struct deadline_data *dd = rq->q->elevator->elevator_data;
+
+		deadline_wunlock_zone(dd, rq);
+	}
+}
+
 static bool dd_has_work(struct blk_mq_hw_ctx *hctx)
 {
 	struct deadline_data *dd = hctx->queue->elevator->elevator_data;
@@ -711,6 +824,7 @@ static struct elevator_type mq_deadline = {
 	.ops.mq = {
 		.insert_requests	= dd_insert_requests,
 		.dispatch_request	= dd_dispatch_request,
+		.completed_request	= dd_completed_request,
 		.next_request		= elv_rb_latter_request,
 		.former_request		= elv_rb_former_request,
 		.bio_merge		= dd_bio_merge,
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH V5 13/14] block: mq-deadline: Limit write request dispatch for zoned block devices
  2017-09-25  6:14 [PATCH V5 00/14] scsi-mq support for ZBC disks Damien Le Moal
                   ` (11 preceding siblings ...)
  2017-09-25  6:14 ` [PATCH V5 12/14] block: mq-deadline: Introduce zone locking support Damien Le Moal
@ 2017-09-25  6:14 ` Damien Le Moal
  2017-09-25 22:06     ` Bart Van Assche
  2017-09-25  6:14 ` [PATCH V5 14/14] block: do not set mq default scheduler Damien Le Moal
  13 siblings, 1 reply; 40+ messages in thread
From: Damien Le Moal @ 2017-09-25  6:14 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

When dispatching writes to a zoned block device, only allow the request
to be dispatched if its target zone is not locked. If it is, leave the
request in the scheduler queue and look for another suitable write
request. If no write can be dispatched, allow reads to be dispatched
even if the write batch is not done.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 block/mq-deadline.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 58 insertions(+), 4 deletions(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 186c32099845..fc3e50a0a495 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -282,19 +282,47 @@ static inline int deadline_check_fifo(struct deadline_data *dd, int ddir)
 }
 
 /*
+ * Test if a request can be dispatched.
+ */
+static inline bool deadline_can_dispatch_request(struct deadline_data *dd,
+						 struct request *rq)
+{
+	if (!deadline_request_needs_zone_wlock(dd, rq))
+		return true;
+	return !deadline_zone_is_wlocked(dd, rq);
+}
+
+/*
  * For the specified data direction, return the next request to
  * dispatch using arrival ordered lists.
  */
 static struct request *
 deadline_fifo_request(struct deadline_data *dd, int data_dir)
 {
+	struct request *rq;
+	unsigned long flags;
+
 	if (WARN_ON_ONCE(data_dir != READ && data_dir != WRITE))
 		return NULL;
 
 	if (list_empty(&dd->fifo_list[data_dir]))
 		return NULL;
 
-	return rq_entry_fifo(dd->fifo_list[data_dir].next);
+	if (!dd->zones_wlock || data_dir == READ)
+		return rq_entry_fifo(dd->fifo_list[data_dir].next);
+
+	spin_lock_irqsave(&dd->zone_lock, flags);
+
+	list_for_each_entry(rq, &dd->fifo_list[WRITE], queuelist) {
+		if (deadline_can_dispatch_request(dd, rq))
+			goto out;
+	}
+	rq = NULL;
+
+out:
+	spin_unlock_irqrestore(&dd->zone_lock, flags);
+
+	return rq;
 }
 
 /*
@@ -304,10 +332,25 @@ deadline_fifo_request(struct deadline_data *dd, int data_dir)
 static struct request *
 deadline_next_request(struct deadline_data *dd, int data_dir)
 {
+	struct request *rq;
+	unsigned long flags;
+
 	if (WARN_ON_ONCE(data_dir != READ && data_dir != WRITE))
 		return NULL;
 
-	return dd->next_rq[data_dir];
+	rq = dd->next_rq[data_dir];
+	if (!dd->zones_wlock || data_dir == READ)
+		return rq;
+
+	spin_lock_irqsave(&dd->zone_lock, flags);
+	while (rq) {
+		if (deadline_can_dispatch_request(dd, rq))
+			break;
+		rq = deadline_latter_request(rq);
+	}
+	spin_unlock_irqrestore(&dd->zone_lock, flags);
+
+	return rq;
 }
 
 /*
@@ -349,7 +392,8 @@ static struct request *__dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
 	if (reads) {
 		BUG_ON(RB_EMPTY_ROOT(&dd->sort_list[READ]));
 
-		if (writes && (dd->starved++ >= dd->writes_starved))
+		if (deadline_fifo_request(dd, WRITE) &&
+		    (dd->starved++ >= dd->writes_starved))
 			goto dispatch_writes;
 
 		data_dir = READ;
@@ -394,6 +438,13 @@ static struct request *__dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
 		rq = next_rq;
 	}
 
+	/*
+	 * If we only have writes queued and none of them can be dispatched,
+	 * rq will be NULL.
+	 */
+	if (!rq)
+		return NULL;
+
 	dd->batching = 0;
 
 dispatch_request:
@@ -560,7 +611,7 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 
 	/*
 	 * This may be a requeue of a request that has locked its
-	 * target zone. If this is the case, release the request zone lock.
+	 * target zone. If this is the case, release the zone lock.
 	 */
 	if (deadline_request_has_zone_wlock(rq))
 		deadline_wunlock_zone(dd, rq);
@@ -570,6 +621,9 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 
 	blk_mq_sched_request_inserted(rq);
 
+	if (at_head && deadline_request_needs_zone_wlock(dd, rq))
+		pr_info("######## Write at head !\n");
+
 	if (at_head || blk_rq_is_passthrough(rq)) {
 		if (at_head)
 			list_add(&rq->queuelist, &dd->dispatch);
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH V5 14/14] block: do not set mq default scheduler
  2017-09-25  6:14 [PATCH V5 00/14] scsi-mq support for ZBC disks Damien Le Moal
                   ` (12 preceding siblings ...)
  2017-09-25  6:14 ` [PATCH V5 13/14] block: mq-deadline: Limit write request dispatch for zoned block devices Damien Le Moal
@ 2017-09-25  6:14 ` Damien Le Moal
  13 siblings, 0 replies; 40+ messages in thread
From: Damien Le Moal @ 2017-09-25  6:14 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe
  Cc: Christoph Hellwig, Bart Van Assche

For blk-mq disks with a single hardware queue, defaulting the disk
scheduler to mq-deadline early during queue initialization prevents
zone write locking from being properly set up for host-managed zoned
block devices, as the disk type is not yet known at that point.

Fix this by simply not setting the default scheduler to mq-deadline for
single hardware queue disks. A udev rule can be used to easily do the
same later in the system initialization sequence, when the device
characteristics are known.
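Such a udev rule might look like the following sketch. The rule file name is illustrative; the `queue/zoned` sysfs attribute reports "host-managed" for ZBC/ZAC host-managed disks:

```
# /etc/udev/rules.d/99-zoned-scheduler.rules (illustrative)
# Select mq-deadline for host-managed zoned disks once the disk type is known.
ACTION=="add|change", KERNEL=="sd[a-z]*", ATTR{queue/zoned}=="host-managed", \
    ATTR{queue/scheduler}="mq-deadline"
```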

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 block/elevator.c | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/block/elevator.c b/block/elevator.c
index 153926a90901..8b65a757f726 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -222,19 +222,14 @@ int elevator_init(struct request_queue *q, char *name)
 
 	if (!e) {
 		/*
-		 * For blk-mq devices, we default to using mq-deadline,
-		 * if available, for single queue devices. If deadline
-		 * isn't available OR we have multiple queues, default
-		 * to "none".
+		 * For blk-mq devices, default to "none". udev can later set
+		 * an appropriate default scheduler based on the disk
+		 * characteristics which we do not yet have here.
 		 */
-		if (q->mq_ops) {
-			if (q->nr_hw_queues == 1)
-				e = elevator_get("mq-deadline", false);
-			if (!e)
-				return 0;
-		} else
-			e = elevator_get(CONFIG_DEFAULT_IOSCHED, false);
+		if (q->mq_ops)
+			return 0;
 
+		e = elevator_get(CONFIG_DEFAULT_IOSCHED, false);
 		if (!e) {
 			printk(KERN_ERR
 				"Default I/O scheduler not found. " \
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH V5 06/14] block: Add zoned block device information to request queue
  2017-09-25  6:14 ` [PATCH V5 06/14] block: Add zoned block device information to request queue Damien Le Moal
@ 2017-09-25 10:05   ` Ming Lei
  2017-09-25 21:06     ` Bart Van Assche
  1 sibling, 0 replies; 40+ messages in thread
From: Ming Lei @ 2017-09-25 10:05 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: linux-scsi, Martin K . Petersen, linux-block, Jens Axboe,
	Christoph Hellwig, Bart Van Assche

On Mon, Sep 25, 2017 at 03:14:46PM +0900, Damien Le Moal wrote:
> Components relying only on the request_queue structure for accessing
> block devices (e.g. I/O schedulers) have a limited knowledge of the
> device characteristics. In particular, the device capacity cannot be
> easily discovered, which for a zoned block device also results in the
> inability to easily know the number of zones of the device (the zone
> size is indicated by the chunk_sectors field of the queue limits).
> 
> Introduce the nr_zones field to the request_queue structure to simplify
> access to this information. Also, add the bitmap seq_zone_bitmap which
> indicates which zones of the device are sequential zones (write
> preferred or write required). These two fields are initialized by the
> low level block device driver (sd.c for ZBC/ZAC disks). They are not
> initialized by stacking drivers (device mappers) handling zoned block
> devices (e.g. dm-linear).
> 
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Ming Lei <ming.lei@redhat.com>

-- 
Ming

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V5 03/14] scsi: sd_zbc: Rearrange code
@ 2017-09-25 21:02     ` Bart Van Assche
  0 siblings, 0 replies; 40+ messages in thread
From: Bart Van Assche @ 2017-09-25 21:02 UTC (permalink / raw)
  To: linux-scsi, linux-block, Damien Le Moal, martin.petersen, axboe; +Cc: hch

On Mon, 2017-09-25 at 15:14 +0900, Damien Le Moal wrote:
> Rearrange sd_zbc_setup() to include use_16_for_rw and use_10_for_rw
> assignments and move the calculation of sdkp->zone_shift together
> with the assignment of the verified zone_blocks value in
> sd_zbc_check_zone_size().
> 
> No functional change is introduced by this patch.
> 
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Bart Van Assche <Bart.VanAssche@wdc.com>


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V5 06/14] block: Add zoned block device information to request queue
@ 2017-09-25 21:06     ` Bart Van Assche
  0 siblings, 0 replies; 40+ messages in thread
From: Bart Van Assche @ 2017-09-25 21:06 UTC (permalink / raw)
  To: linux-scsi, linux-block, Damien Le Moal, martin.petersen, axboe; +Cc: hch

On Mon, 2017-09-25 at 15:14 +0900, Damien Le Moal wrote:
> Components relying only on the request_queue structure for accessing
> block devices (e.g. I/O schedulers) have a limited knowledge of the
> device characteristics. In particular, the device capacity cannot be
> easily discovered, which for a zoned block device also results in the
> inability to easily know the number of zones of the device (the zone
> size is indicated by the chunk_sectors field of the queue limits).
> 
> Introduce the nr_zones field to the request_queue structure to simplify
> access to this information. Also, add the bitmap seq_zone_bitmap which
> indicates which zones of the device are sequential zones (write
> preferred or write required). These two fields are initialized by the
> low level block device driver (sd.c for ZBC/ZAC disks). They are not
> initialized by stacking drivers (device mappers) handling zoned block
> devices (e.g. dm-linear).

Reviewed-by: Bart Van Assche <Bart.VanAssche@wdc.com>


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V5 07/14] scsi: sd_zbc: Initialize device request queue zoned data
@ 2017-09-25 21:17     ` Bart Van Assche
  0 siblings, 0 replies; 40+ messages in thread
From: Bart Van Assche @ 2017-09-25 21:17 UTC (permalink / raw)
  To: linux-scsi, linux-block, Damien Le Moal, martin.petersen, axboe; +Cc: hch

On Mon, 2017-09-25 at 15:14 +0900, Damien Le Moal wrote:
> +	return kzalloc_node(BITS_TO_LONGS(sdkp->nr_zones)
> +			    * sizeof(unsigned long),

Does this perhaps fit on one line?

> +/**
> + * sd_zbc_get_seq_zones - Parse report zones reply to identify sequential zones
> + * @sdkp: disk used
> + * @buf: report reply buffer
> + * @seq_zone_bitamp: bitmap of sequential zones to set
> + * @zno: Zone number of the first zone in the report

'zno' is an input and output parameter but the above line only describes what
happens with the value passed to sd_zbc_get_seq_zones(). I think we also need a
description for the value of 'zno' upon return.

> + * Parse reported zone descriptors to find sequiential zones.
                                              ^^^^^^^^^^^
                                              sequential?

Otherwise this patch looks fine to me.

Bart.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V5 10/14] block: mq-deadline: Add zoned block device data
@ 2017-09-25 21:34     ` Bart Van Assche
  0 siblings, 0 replies; 40+ messages in thread
From: Bart Van Assche @ 2017-09-25 21:34 UTC (permalink / raw)
  To: linux-scsi, linux-block, Damien Le Moal, martin.petersen, axboe; +Cc: hch

On Mon, 2017-09-25 at 15:14 +0900, Damien Le Moal wrote:
> Modify mq-dealine init_queue and exit_queue elevator methods to handle
         ^^^^^^^^^^
         mq-deadline ?

> +static int deadline_init_zones_wlock(struct request_queue *q,
> +				     struct deadline_data *dd)
> +{
> +	/*
> +	 * For regular drives or non-conforming zoned block device,
> +	 * do not use zone write locking.
> +	 */
> +	if (!blk_queue_nr_zones(q))
> +		return 0;
> +
> +	/*
> +	 * Treat host aware drives as regular disks.
> +	 */
> +	if (blk_queue_zoned_model(q) != BLK_ZONED_HM)
> +		return 0;
> +
> +	dd->zones_wlock = kzalloc_node(BITS_TO_LONGS(blk_queue_nr_zones(q))
> +				       * sizeof(unsigned long),
> +				       GFP_KERNEL, q->node);

A request queue is created before disk validation occurs and before the
number of zones is initialized (sd_probe_async()). If a scheduler is
assigned to a ZBC drive through a udev rule, can it happen that
deadline_init_zones_wlock() is called before the number of zones has been
initialized?

Bart.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V5 11/14] blokc: mq-deadline: Introduce dispatch helpers
@ 2017-09-25 21:44     ` Bart Van Assche
  0 siblings, 0 replies; 40+ messages in thread
From: Bart Van Assche @ 2017-09-25 21:44 UTC (permalink / raw)
  To: linux-scsi, linux-block, Damien Le Moal, martin.petersen, axboe; +Cc: hch

On Mon, 2017-09-25 at 15:14 +0900, Damien Le Moal wrote:
> Avoid directly referencing the next_rq and fifo_list arrays using the
> helper functions deadline_next_request() and deadline_fifo_request() to
> facilitate changes in the dispatch request selection in
> __dd_dispatch_request().

Reviewed-by: Bart Van Assche <Bart.VanAssche@wdc.com>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V5 12/14] block: mq-deadline: Introduce zone locking support
@ 2017-09-25 22:00     ` Bart Van Assche
  0 siblings, 0 replies; 40+ messages in thread
From: Bart Van Assche @ 2017-09-25 22:00 UTC (permalink / raw)
  To: linux-scsi, linux-block, Damien Le Moal, martin.petersen, axboe; +Cc: hch

On Mon, 2017-09-25 at 15:14 +0900, Damien Le Moal wrote:
> +static inline bool deadline_request_needs_zone_wlock(struct deadline_data *dd,
> +						     struct request *rq)
> +{
> +
> +	if (!dd->zones_wlock)
> +		return false;
> +
> +	if (blk_rq_is_passthrough(rq))
> +		return false;
> +
> +	switch (req_op(rq)) {
> +	case REQ_OP_WRITE_ZEROES:
> +	case REQ_OP_WRITE_SAME:
> +	case REQ_OP_WRITE:
> +		return blk_rq_zone_is_seq(rq);
> +	default:
> +		return false;
> +	}

If anyone ever adds a new write request type it will be easy to overlook this
function. Should the 'default' case be left out and should all request types
be mentioned in the switch/case statement such that the compiler will issue a
warning if a new request operation type is added to enum req_opf?

> +/*
> + * Abuse the elv.priv[0] pointer to indicate if a request has write
> + * locked its target zone. Only write request to a zoned block device
> + * can own a zone write lock.
> + */
> +#define RQ_ZONE_WLOCKED		((void *)1UL)
> +static inline void deadline_set_request_zone_wlock(struct request *rq)
> +{
> +	rq->elv.priv[0] = RQ_ZONE_WLOCKED;
> +}
> +
> +#define RQ_ZONE_NO_WLOCK	((void *)0UL)
> +static inline void deadline_clear_request_zone_wlock(struct request *rq)
> +{
> +	rq->elv.priv[0] = RQ_ZONE_NO_WLOCK;
> +}

Should an enumeration type be introduced for RQ_ZONE_WLOCKED and RQ_ZONE_NO_WLOCK?

> +/*
> + * Write lock the target zone of a write request.
> + */
> +static void deadline_wlock_zone(struct deadline_data *dd,
> +				struct request *rq)
> +{
> +	unsigned int zno = blk_rq_zone_no(rq);
> +
> +	WARN_ON_ONCE(deadline_request_has_zone_wlock(rq));
> +	WARN_ON_ONCE(test_and_set_bit(zno, dd->zones_wlock));
> +	deadline_set_request_zone_wlock(rq);
> +}
> +
> +/*
> + * Write unlock the target zone of a write request.
> + */
> +static void deadline_wunlock_zone(struct deadline_data *dd,
> +				  struct request *rq)
> +{
> +	unsigned int zno = blk_rq_zone_no(rq);
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&dd->zone_lock, flags);
> +
> +	WARN_ON_ONCE(!test_and_clear_bit(zno, dd->zones_wlock));
> +	deadline_clear_request_zone_wlock(rq);
> +
> +	spin_unlock_irqrestore(&dd->zone_lock, flags);
> +}

Why does deadline_wunlock_zone() protect modifications with dd->zone_lock but
deadline_wlock_zone() not? If this code is correct, please add a
lockdep_assert_held() statement in the first function.

> +/*
> + * Test the write lock state of the target zone of a write request.
> + */
> +static inline bool deadline_zone_is_wlocked(struct deadline_data *dd,
> +					    struct request *rq)
> +{
> +	unsigned int zno = blk_rq_zone_no(rq);
> +
> +	return test_bit(zno, dd->zones_wlock);
> +}

Do we really need the local variable 'zno'?

> +/*
> + * For zoned block devices, write unlock the target zone of
> + * completed write requests.
> + */
> +static void dd_completed_request(struct request *rq)
> +{
> +

Please leave out the blank line at the start of this function.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V5 13/14] block: mq-deadline: Limit write request dispatch for zoned block devices
@ 2017-09-25 22:06     ` Bart Van Assche
  0 siblings, 0 replies; 40+ messages in thread
From: Bart Van Assche @ 2017-09-25 22:06 UTC (permalink / raw)
  To: linux-scsi, linux-block, Damien Le Moal, martin.petersen, axboe; +Cc: hch

On Mon, 2017-09-25 at 15:14 +0900, Damien Le Moal wrote:
> -	return rq_entry_fifo(dd->fifo_list[data_dir].next);
> +	if (!dd->zones_wlock || data_dir == READ)
> +		return rq_entry_fifo(dd->fifo_list[data_dir].next);
> +
> +	spin_lock_irqsave(&dd->zone_lock, flags);
> +
> +	list_for_each_entry(rq, &dd->fifo_list[WRITE], queuelist) {
> +		if (deadline_can_dispatch_request(dd, rq))
> +			goto out;
> +	}
> +	rq = NULL;
> +
> +out:
> +	spin_unlock_irqrestore(&dd->zone_lock, flags);

Is it documented somewhere what dd->zone_lock protects and when that lock should be
acquired?

> 	/*
>  	 * This may be a requeue of a request that has locked its
> -	 * target zone. If this is the case, release the request zone lock.
> +	 * target zone. If this is the case, release the zone lock.
>  	 */
>  	if (deadline_request_has_zone_wlock(rq))
>  		deadline_wunlock_zone(dd, rq);

Can this change be folded into the patch that introduced that comment?

> @@ -570,6 +621,9 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
>  
>  	blk_mq_sched_request_inserted(rq);
>  
> +	if (at_head && deadline_request_needs_zone_wlock(dd, rq))
> +		pr_info("######## Write at head !\n");
> +
>  	if (at_head || blk_rq_is_passthrough(rq)) {
>  		if (at_head)
>  			list_add(&rq->queuelist, &dd->dispatch);

Will it be easy to users who analyze a kernel log to figure out why that
message has been generated? Should that message perhaps include the block
device name, zone number and request sector number?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V5 07/14] scsi: sd_zbc: Initialize device request queue zoned data
@ 2017-10-02  4:29       ` Damien Le Moal
  0 siblings, 0 replies; 40+ messages in thread
From: Damien Le Moal @ 2017-10-02  4:29 UTC (permalink / raw)
  To: linux-scsi, linux-block, Bart Van Assche, martin.petersen, axboe; +Cc: hch

On Mon, 2017-09-25 at 21:17 +0000, Bart Van Assche wrote:
> On Mon, 2017-09-25 at 15:14 +0900, Damien Le Moal wrote:
> > +	return kzalloc_node(BITS_TO_LONGS(sdkp->nr_zones)
> > +			    * sizeof(unsigned long),
> 
> Does this perhaps fit on one line?
> 
> > +/**
> > + * sd_zbc_get_seq_zones - Parse report zones reply to identify sequential
> > zones
> > + * @sdkp: disk used
> > + * @buf: report reply buffer
> > + * @seq_zone_bitamp: bitmap of sequential zones to set
> > + * @zno: Zone number of the first zone in the report
> 
> 'zno' is an input and output parameter but the above line only describes
> what
> happens with the value passed to sd_zbc_get_seq_zones(). I think we also
> need a
> description for the value of 'zno' upon return.

zno is not necessary since it can be calculated from the reported zone start
LBA. So I removed it in V6 to simplify.

-- 
Damien Le Moal
Western Digital

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH V5 10/14] block: mq-deadline: Add zoned block device data
@ 2017-10-02  4:32       ` Damien Le Moal
  0 siblings, 0 replies; 40+ messages in thread
From: Damien Le Moal @ 2017-10-02  4:32 UTC (permalink / raw)
  To: linux-scsi, linux-block, Bart Van Assche, martin.petersen, axboe; +Cc: hch

Bart,

On Mon, 2017-09-25 at 21:34 +0000, Bart Van Assche wrote:
> On Mon, 2017-09-25 at 15:14 +0900, Damien Le Moal wrote:
> > Modify mq-dealine init_queue and exit_queue elevator methods to handle
> 
>          ^^^^^^^^^^
>          mq-deadline ?
> 
> > +static int deadline_init_zones_wlock(struct request_queue *q,
> > +				     struct deadline_data *dd)
> > +{
> > +	/*
> > +	 * For regular drives or non-conforming zoned block device,
> > +	 * do not use zone write locking.
> > +	 */
> > +	if (!blk_queue_nr_zones(q))
> > +		return 0;
> > +
> > +	/*
> > +	 * Treat host aware drives as regular disks.
> > +	 */
> > +	if (blk_queue_zoned_model(q) != BLK_ZONED_HM)
> > +		return 0;
> > +
> > +	dd->zones_wlock = kzalloc_node(BITS_TO_LONGS(blk_queue_nr_zones(q))
> > +				       * sizeof(unsigned long),
> > +				       GFP_KERNEL, q->node);
> 
> A request queue is created before disk validation occurs and before the
> number of zones is initialized (sd_probe_async()). If a scheduler is
> assigned to a ZBC drive through a udev rule, can it happen that
> deadline_init_zones_wlock() is called before the number of zones has been
> initialized?

Yes indeed. I am aware of this, hence the last patch of the series which
disables setting a default scheduler for blk-mq devices.

That is however a little extreme. So for V6, I am looking at deferring the
default elevator setup until after device information is gathered. I am not
sure where the right place to do so is though. Still looking.

Best regards.

-- 
Damien Le Moal
Western Digital


* Re: [PATCH V5 12/14] block: mq-deadline: Introduce zone locking support
@ 2017-10-02  4:36       ` Damien Le Moal
  0 siblings, 0 replies; 40+ messages in thread
From: Damien Le Moal @ 2017-10-02  4:36 UTC (permalink / raw)
  To: linux-scsi, linux-block, Bart Van Assche, martin.petersen, axboe; +Cc: hch

Bart,

On Mon, 2017-09-25 at 22:00 +0000, Bart Van Assche wrote:
> On Mon, 2017-09-25 at 15:14 +0900, Damien Le Moal wrote:
> > +static inline bool deadline_request_needs_zone_wlock(struct deadline_data *dd,
> > +						     struct request *rq)
> > +{
> > +
> > +	if (!dd->zones_wlock)
> > +		return false;
> > +
> > +	if (blk_rq_is_passthrough(rq))
> > +		return false;
> > +
> > +	switch (req_op(rq)) {
> > +	case REQ_OP_WRITE_ZEROES:
> > +	case REQ_OP_WRITE_SAME:
> > +	case REQ_OP_WRITE:
> > +		return blk_rq_zone_is_seq(rq);
> > +	default:
> > +		return false;
> > +	}
> 
> If anyone ever adds a new write request type it will be easy to overlook
> this function. Should the 'default' case be left out and should all request
> types be mentioned in the switch/case statement such that the compiler will
> issue a warning if a new request operation type is added to enum req_opf?

I tried, but that does not work. The switch-case needs either a default case
or a return after it. Otherwise I get a compilation warning (reached end of
non-void function).

> > +/*
> > + * Abuse the elv.priv[0] pointer to indicate if a request has write
> > + * locked its target zone. Only write request to a zoned block device
> > + * can own a zone write lock.
> > + */
> > +#define RQ_ZONE_WLOCKED		((void *)1UL)
> > +static inline void deadline_set_request_zone_wlock(struct request *rq)
> > +{
> > +	rq->elv.priv[0] = RQ_ZONE_WLOCKED;
> > +}
> > +
> > +#define RQ_ZONE_NO_WLOCK	((void *)0UL)
> > +static inline void deadline_clear_request_zone_wlock(struct request *rq)
> > +{
> > +	rq->elv.priv[0] = RQ_ZONE_NO_WLOCK;
> > +}
> 
> Should an enumeration type be introduced for RQ_ZONE_WLOCKED and
> RQ_ZONE_NO_WLOCK?

Sure. Added in V6.

> > +/*
> > + * Write lock the target zone of a write request.
> > + */
> > +static void deadline_wlock_zone(struct deadline_data *dd,
> > +				struct request *rq)
> > +{
> > +	unsigned int zno = blk_rq_zone_no(rq);
> > +
> > +	WARN_ON_ONCE(deadline_request_has_zone_wlock(rq));
> > +	WARN_ON_ONCE(test_and_set_bit(zno, dd->zones_wlock));
> > +	deadline_set_request_zone_wlock(rq);
> > +}
> > +
> > +/*
> > + * Write unlock the target zone of a write request.
> > + */
> > +static void deadline_wunlock_zone(struct deadline_data *dd,
> > +				  struct request *rq)
> > +{
> > +	unsigned int zno = blk_rq_zone_no(rq);
> > +	unsigned long flags;
> > +
> > +	spin_lock_irqsave(&dd->zone_lock, flags);
> > +
> > +	WARN_ON_ONCE(!test_and_clear_bit(zno, dd->zones_wlock));
> > +	deadline_clear_request_zone_wlock(rq);
> > +
> > +	spin_unlock_irqrestore(&dd->zone_lock, flags);
> > +}
> 
> Why does deadline_wunlock_zone() protect modifications with dd->zone_lock
> but deadline_wlock_zone() not? If this code is correct, please add a
> lockdep_assert_held() statement in the first function.

Yes, that was a little confusing. In V6, I move the introduction of the
zone_lock spinlock to when it is actually needed, that is the patch following
this one. And I added more comments in both the commit message and in the code
to explain why the spinlock is needed.

> > +/*
> > + * Test the write lock state of the target zone of a write request.
> > + */
> > +static inline bool deadline_zone_is_wlocked(struct deadline_data *dd,
> > +					    struct request *rq)
> > +{
> > +	unsigned int zno = blk_rq_zone_no(rq);
> > +
> > +	return test_bit(zno, dd->zones_wlock);
> > +}
> 
> Do we really need the local variable 'zno'?

No we don't. Fixed.

Best regards.

-- 
Damien Le Moal
Western Digital


* Re: [PATCH V5 13/14] block: mq-deadline: Limit write request dispatch for zoned block devices
@ 2017-10-02  4:38       ` Damien Le Moal
  0 siblings, 0 replies; 40+ messages in thread
From: Damien Le Moal @ 2017-10-02  4:38 UTC (permalink / raw)
  To: linux-scsi, linux-block, Bart Van Assche, martin.petersen, axboe; +Cc: hch

On Mon, 2017-09-25 at 22:06 +0000, Bart Van Assche wrote:
> On Mon, 2017-09-25 at 15:14 +0900, Damien Le Moal wrote:
> > -	return rq_entry_fifo(dd->fifo_list[data_dir].next);
> > +	if (!dd->zones_wlock || data_dir == READ)
> > +		return rq_entry_fifo(dd->fifo_list[data_dir].next);
> > +
> > +	spin_lock_irqsave(&dd->zone_lock, flags);
> > +
> > +	list_for_each_entry(rq, &dd->fifo_list[WRITE], queuelist) {
> > +		if (deadline_can_dispatch_request(dd, rq))
> > +			goto out;
> > +	}
> > +	rq = NULL;
> > +
> > +out:
> > +	spin_unlock_irqrestore(&dd->zone_lock, flags);
> 
> Is it documented somewhere what dd->zone_lock protects and when that lock
> should be acquired?

It was not well explained. I added comments in V6.

> 
> > 	/*
> >  	 * This may be a requeue of a request that has locked its
> > -	 * target zone. If this is the case, release the request zone lock.
> > +	 * target zone. If this is the case, release the zone lock.
> >  	 */
> >  	if (deadline_request_has_zone_wlock(rq))
> >  		deadline_wunlock_zone(dd, rq);
> 
> Can this change be folded into the patch that introduced that comment?

Of course. Fixed in V6.

> > @@ -570,6 +621,9 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
> >  
> >  	blk_mq_sched_request_inserted(rq);
> >  
> > +	if (at_head && deadline_request_needs_zone_wlock(dd, rq))
> > +		pr_info("######## Write at head !\n");
> > +
> >  	if (at_head || blk_rq_is_passthrough(rq)) {
> >  		if (at_head)
> >  			list_add(&rq->queuelist, &dd->dispatch);
> 
> Will it be easy for users who analyze a kernel log to figure out why that
> message has been generated? Should that message perhaps include the block
> device name, zone number and request sector number?

This was just a debug message for me that I forgot to remove. I did in V6.
Thanks for catching this.

Best regards.

-- 
Damien Le Moal
Western Digital


* Re: [PATCH V5 02/14] scsi: sd_zbc: Fix comments and indentation
@ 2017-10-02 23:01     ` Bart Van Assche
  0 siblings, 0 replies; 40+ messages in thread
From: Bart Van Assche @ 2017-10-02 23:01 UTC (permalink / raw)
  To: linux-scsi, linux-block, Damien Le Moal, martin.petersen, axboe; +Cc: hch

On Mon, 2017-09-25 at 15:14 +0900, Damien Le Moal wrote:
> + * sd_zbc_zone_no - Get the number of the zone conataining a sector.

Please fix the spelling in the above comment ("conataining").

Anyway:

Reviewed-by: Bart Van Assche <Bart.VanAssche@wdc.com>



end of thread [newest: 2017-10-02 23:01 UTC]

Thread overview: 40+ messages
2017-09-25  6:14 [PATCH V5 00/14] scsi-mq support for ZBC disks Damien Le Moal
2017-09-25  6:14 ` [PATCH V5 01/14] scsi: sd_zbc: Move ZBC declarations to scsi_proto.h Damien Le Moal
2017-09-25  6:14 ` [PATCH V5 02/14] scsi: sd_zbc: Fix comments and indentation Damien Le Moal
2017-10-02 23:01   ` Bart Van Assche
2017-10-02 23:01     ` Bart Van Assche
2017-09-25  6:14 ` [PATCH V5 03/14] scsi: sd_zbc: Rearrange code Damien Le Moal
2017-09-25 21:02   ` Bart Van Assche
2017-09-25 21:02     ` Bart Van Assche
2017-09-25  6:14 ` [PATCH V5 04/14] scsi: sd_zbc: Use well defined macros Damien Le Moal
2017-09-25  6:14 ` [PATCH V5 05/14] scsi: sd_zbc: Fix sd_zbc_read_zoned_characteristics() Damien Le Moal
2017-09-25  6:14 ` [PATCH V5 06/14] block: Add zoned block device information to request queue Damien Le Moal
2017-09-25 10:05   ` Ming Lei
2017-09-25 21:06   ` Bart Van Assche
2017-09-25 21:06     ` Bart Van Assche
2017-09-25  6:14 ` [PATCH V5 07/14] scsi: sd_zbc: Initialize device request queue zoned data Damien Le Moal
2017-09-25 21:17   ` Bart Van Assche
2017-09-25 21:17     ` Bart Van Assche
2017-10-02  4:29     ` Damien Le Moal
2017-10-02  4:29       ` Damien Le Moal
2017-09-25  6:14 ` [PATCH V5 08/14] scsi: sd_zbc: Limit zone write locking to sequential zones Damien Le Moal
2017-09-25  6:14 ` [PATCH V5 09/14] scsi: sd_zbc: Disable zone write locking with scsi-mq Damien Le Moal
2017-09-25  6:14 ` [PATCH V5 10/14] block: mq-deadline: Add zoned block device data Damien Le Moal
2017-09-25 21:34   ` Bart Van Assche
2017-09-25 21:34     ` Bart Van Assche
2017-10-02  4:32     ` Damien Le Moal
2017-10-02  4:32       ` Damien Le Moal
2017-09-25  6:14 ` [PATCH V5 11/14] blokc: mq-deadline: Introduce dispatch helpers Damien Le Moal
2017-09-25 21:44   ` Bart Van Assche
2017-09-25 21:44     ` Bart Van Assche
2017-09-25  6:14 ` [PATCH V5 12/14] block: mq-deadline: Introduce zone locking support Damien Le Moal
2017-09-25 22:00   ` Bart Van Assche
2017-09-25 22:00     ` Bart Van Assche
2017-10-02  4:36     ` Damien Le Moal
2017-10-02  4:36       ` Damien Le Moal
2017-09-25  6:14 ` [PATCH V5 13/14] block: mq-deadline: Limit write request dispatch for zoned block devices Damien Le Moal
2017-09-25 22:06   ` Bart Van Assche
2017-09-25 22:06     ` Bart Van Assche
2017-10-02  4:38     ` Damien Le Moal
2017-10-02  4:38       ` Damien Le Moal
2017-09-25  6:14 ` [PATCH V5 14/14] block: do not set mq default scheduler Damien Le Moal
