* [PATCH v5 0/7] Improve the performance for zoned UFS devices
@ 2023-07-31 22:14 Bart Van Assche
  2023-07-31 22:14 ` [PATCH v5 1/7] block: Introduce the flag QUEUE_FLAG_NO_ZONE_WRITE_LOCK Bart Van Assche
                   ` (6 more replies)
  0 siblings, 7 replies; 17+ messages in thread
From: Bart Van Assche @ 2023-07-31 22:14 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Martin K . Petersen, Bart Van Assche

Hi Jens,

This patch series improves small write IOPS by a factor of four (+300%) for
zoned UFS devices on my test setup with a UFSHCI 3.0 controller. Please
consider this patch series for the next merge window.

Thank you,

Bart.

Changes compared to v4:
 - Dropped the patch that introduces the REQ_NO_ZONE_WRITE_LOCK flag.
 - Dropped the null_blk patch and added two scsi_debug patches instead.
 - Dropped the f2fs patch.
 - Split the patch for the UFS driver into two patches.
 - Modified several patch descriptions and source code comments.
 - Renamed dd_use_write_locking() into dd_use_zone_write_locking().
 - Moved the list_sort() call from scsi_unjam_host() into scsi_eh_flush_done_q()
   such that sorting happens just before reinserting.
 - Removed the scsi_cmd_retry_allowed() call from scsi_check_sense() to make
   sure that the retry counter is adjusted once per retry instead of twice.

Changes compared to v3:
 - Restored the patch that introduces QUEUE_FLAG_NO_ZONE_WRITE_LOCK. That patch
   had accidentally been left out from v2.
 - In patch "block: Introduce the flag REQ_NO_ZONE_WRITE_LOCK", improved the
   patch description and added the function blk_no_zone_write_lock().
 - In patch "block/mq-deadline: Only use zone locking if necessary", moved the
   blk_queue_is_zoned() call into dd_use_write_locking().
 - In patch "fs/f2fs: Disable zone write locking", set REQ_NO_ZONE_WRITE_LOCK
   from inside __bio_alloc() instead of in f2fs_submit_write_bio().

Changes compared to v2:
 - Renamed the request queue flag for disabling zone write locking.
 - Introduced a new request flag for disabling zone write locking.
 - Modified the mq-deadline scheduler such that zone write locking is only
   disabled if both flags are set.
 - Added an F2FS patch that sets the request flag for disabling zone write
   locking.
 - Only disable zone write locking in the UFS driver if auto-hibernation is
   disabled.

Changes compared to v1:
 - Left out the patches that are already upstream.
 - Switched the approach in patch "scsi: Retry unaligned zoned writes" from
   retrying immediately to sending unaligned write commands to the SCSI error
   handler.

Bart Van Assche (7):
  block: Introduce the flag QUEUE_FLAG_NO_ZONE_WRITE_LOCK
  block/mq-deadline: Only use zone locking if necessary
  scsi: core: Retry unaligned zoned writes
  scsi: scsi_debug: Support disabling zone write locking
  scsi: scsi_debug: Support injecting unaligned write errors
  scsi: ufs: Split an if-condition
  scsi: ufs: Disable zone write locking

 block/mq-deadline.c       | 24 ++++++++++++++------
 drivers/scsi/scsi_debug.c | 20 +++++++++++++++-
 drivers/scsi/scsi_error.c | 37 ++++++++++++++++++++++
 drivers/scsi/scsi_lib.c   |  1 +
 drivers/scsi/sd.c         |  3 +++
 drivers/ufs/core/ufshcd.c | 45 ++++++++++++++++++++++---
 include/linux/blkdev.h    | 10 ++++++++++
 include/scsi/scsi.h       |  1 +
 8 files changed, 131 insertions(+), 10 deletions(-)



* [PATCH v5 1/7] block: Introduce the flag QUEUE_FLAG_NO_ZONE_WRITE_LOCK
  2023-07-31 22:14 [PATCH v5 0/7] Improve the performance for zoned UFS devices Bart Van Assche
@ 2023-07-31 22:14 ` Bart Van Assche
  2023-08-02  9:12   ` Damien Le Moal
  2023-07-31 22:14 ` [PATCH v5 2/7] block/mq-deadline: Only use zone locking if necessary Bart Van Assche
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Bart Van Assche @ 2023-07-31 22:14 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Martin K . Petersen,
	Bart Van Assche, Damien Le Moal, Ming Lei

Writes in sequential write required zones must happen at the write
pointer. Even if the submitter of the write commands (e.g. a filesystem)
submits writes for sequential write required zones in order, the block
layer or the storage controller may reorder these write commands.

The zone locking mechanism in the mq-deadline I/O scheduler serializes
write commands for sequential zones. Some but not all storage controllers
require this serialization. Introduce a new request queue flag to allow
block drivers to indicate that they preserve the order of write commands
and thus do not require serialization of writes per zone.
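The write-pointer rule described above can be modeled in a few lines. This is a user-space Python sketch, not kernel code; the class and method names are illustrative:

```python
class SeqWriteRequiredZone:
    """Toy model of a BLK_ZONE_TYPE_SEQWRITE_REQ zone."""

    def __init__(self, start, size):
        self.start = start
        self.size = size
        self.write_pointer = start  # next LBA that may be written

    def write(self, lba, nr_blocks):
        # A write anywhere other than the write pointer fails the way a
        # real device does: with an UNALIGNED WRITE COMMAND error.
        if lba != self.write_pointer:
            raise ValueError("UNALIGNED WRITE COMMAND")
        self.write_pointer += nr_blocks

zone = SeqWriteRequiredZone(start=0, size=128)
zone.write(0, 8)   # in order: accepted, write pointer advances to 8
zone.write(8, 8)   # still at the write pointer: accepted
try:
    zone.write(32, 8)  # reordered write: rejected by the device
except ValueError as e:
    print(e)  # UNALIGNED WRITE COMMAND
```

Any reordering between submission and the device therefore turns an otherwise valid write into an error, which is what zone write locking (or an order-preserving controller) prevents.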

Cc: Christoph Hellwig <hch@lst.de>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 include/linux/blkdev.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 2f5371b8482c..de5e05cc34fa 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -534,6 +534,11 @@ struct request_queue {
 #define QUEUE_FLAG_NONROT	6	/* non-rotational device (SSD) */
 #define QUEUE_FLAG_VIRT		QUEUE_FLAG_NONROT /* paravirt device */
 #define QUEUE_FLAG_IO_STAT	7	/* do disk/partitions IO accounting */
+/*
+ * Do not serialize sequential writes (REQ_OP_WRITE, REQ_OP_WRITE_ZEROES) sent
+ * to a sequential write required zone (BLK_ZONE_TYPE_SEQWRITE_REQ).
+ */
+#define QUEUE_FLAG_NO_ZONE_WRITE_LOCK 8
 #define QUEUE_FLAG_NOXMERGES	9	/* No extended merges */
 #define QUEUE_FLAG_ADD_RANDOM	10	/* Contributes to random pool */
 #define QUEUE_FLAG_SYNCHRONOUS	11	/* always completes in submit context */
@@ -597,6 +602,11 @@ bool blk_queue_flag_test_and_set(unsigned int flag, struct request_queue *q);
 #define blk_queue_skip_tagset_quiesce(q) \
 	test_bit(QUEUE_FLAG_SKIP_TAGSET_QUIESCE, &(q)->queue_flags)
 
+static inline bool blk_queue_no_zone_write_lock(struct request_queue *q)
+{
+	return test_bit(QUEUE_FLAG_NO_ZONE_WRITE_LOCK, &q->queue_flags);
+}
+
 extern void blk_set_pm_only(struct request_queue *q);
 extern void blk_clear_pm_only(struct request_queue *q);
 


* [PATCH v5 2/7] block/mq-deadline: Only use zone locking if necessary
  2023-07-31 22:14 [PATCH v5 0/7] Improve the performance for zoned UFS devices Bart Van Assche
  2023-07-31 22:14 ` [PATCH v5 1/7] block: Introduce the flag QUEUE_FLAG_NO_ZONE_WRITE_LOCK Bart Van Assche
@ 2023-07-31 22:14 ` Bart Van Assche
  2023-08-02  9:12   ` Damien Le Moal
  2023-07-31 22:14 ` [PATCH v5 3/7] scsi: core: Retry unaligned zoned writes Bart Van Assche
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Bart Van Assche @ 2023-07-31 22:14 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Martin K . Petersen,
	Bart Van Assche, Damien Le Moal, Ming Lei

Measurements have shown that limiting the queue depth to one per zone for
zoned writes has a significant negative performance impact on zoned UFS
devices. Hence this patch that disables zone locking by the mq-deadline
scheduler if the storage controller preserves the command order. This
patch is based on the following assumptions:
- It happens infrequently that zoned write requests are reordered by the
  block layer.
- The I/O priority of all write requests is the same per zone.
- Either no I/O scheduler is used or an I/O scheduler is used that
  submits write requests per zone in LBA order.
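Under these assumptions the dispatch decision reduces to a small predicate. A Python sketch of the logic the patch adds as dd_use_zone_write_locking(), with the queue state passed in as plain booleans (names are illustrative):

```python
def dd_use_zone_write_locking(queue_is_zoned, no_zone_write_lock):
    """Zone write locking is used only for zoned request queues whose
    driver has not declared that it preserves write order."""
    return queue_is_zoned and not no_zone_write_lock

# Conventional zoned device (e.g. an SMR HDD): locking stays on.
assert dd_use_zone_write_locking(True, False)
# Zoned UFS device behind an order-preserving controller: locking skipped.
assert not dd_use_zone_write_locking(True, True)
# Non-zoned device: the question never arises.
assert not dd_use_zone_write_locking(False, False)
```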

Cc: Christoph Hellwig <hch@lst.de>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/mq-deadline.c | 24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 02a916ba62ee..1f4124dd4a0b 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -338,6 +338,16 @@ static struct request *deadline_skip_seq_writes(struct deadline_data *dd,
 	return rq;
 }
 
+/*
+ * Use write locking if QUEUE_FLAG_NO_ZONE_WRITE_LOCK has not been set.
+ * Not using zone write locking is only safe if the block driver preserves the
+ * request order.
+ */
+static bool dd_use_zone_write_locking(struct request_queue *q)
+{
+	return blk_queue_is_zoned(q) && !blk_queue_no_zone_write_lock(q);
+}
+
 /*
  * For the specified data direction, return the next request to
  * dispatch using arrival ordered lists.
@@ -353,7 +363,7 @@ deadline_fifo_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
 		return NULL;
 
 	rq = rq_entry_fifo(per_prio->fifo_list[data_dir].next);
-	if (data_dir == DD_READ || !blk_queue_is_zoned(rq->q))
+	if (data_dir == DD_READ || !dd_use_zone_write_locking(rq->q))
 		return rq;
 
 	/*
@@ -398,7 +408,7 @@ deadline_next_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
 	if (!rq)
 		return NULL;
 
-	if (data_dir == DD_READ || !blk_queue_is_zoned(rq->q))
+	if (data_dir == DD_READ || !dd_use_zone_write_locking(rq->q))
 		return rq;
 
 	/*
@@ -526,8 +536,9 @@ static struct request *__dd_dispatch_request(struct deadline_data *dd,
 	}
 
 	/*
-	 * For a zoned block device, if we only have writes queued and none of
-	 * them can be dispatched, rq will be NULL.
+	 * For a zoned block device that requires write serialization, if we
+	 * only have writes queued and none of them can be dispatched, rq will
+	 * be NULL.
 	 */
 	if (!rq)
 		return NULL;
@@ -552,7 +563,8 @@ static struct request *__dd_dispatch_request(struct deadline_data *dd,
 	/*
 	 * If the request needs its target zone locked, do it.
 	 */
-	blk_req_zone_write_lock(rq);
+	if (dd_use_zone_write_locking(rq->q))
+		blk_req_zone_write_lock(rq);
 	rq->rq_flags |= RQF_STARTED;
 	return rq;
 }
@@ -933,7 +945,7 @@ static void dd_finish_request(struct request *rq)
 
 	atomic_inc(&per_prio->stats.completed);
 
-	if (blk_queue_is_zoned(q)) {
+	if (dd_use_zone_write_locking(rq->q)) {
 		unsigned long flags;
 
 		spin_lock_irqsave(&dd->zone_lock, flags);


* [PATCH v5 3/7] scsi: core: Retry unaligned zoned writes
  2023-07-31 22:14 [PATCH v5 0/7] Improve the performance for zoned UFS devices Bart Van Assche
  2023-07-31 22:14 ` [PATCH v5 1/7] block: Introduce the flag QUEUE_FLAG_NO_ZONE_WRITE_LOCK Bart Van Assche
  2023-07-31 22:14 ` [PATCH v5 2/7] block/mq-deadline: Only use zone locking if necessary Bart Van Assche
@ 2023-07-31 22:14 ` Bart Van Assche
  2023-08-02  9:12   ` Damien Le Moal
                     ` (2 more replies)
  2023-07-31 22:14 ` [PATCH v5 4/7] scsi: scsi_debug: Support disabling zone write locking Bart Van Assche
                   ` (3 subsequent siblings)
  6 siblings, 3 replies; 17+ messages in thread
From: Bart Van Assche @ 2023-07-31 22:14 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Martin K . Petersen,
	Bart Van Assche, Damien Le Moal, Ming Lei, James E.J. Bottomley

If zoned writes (REQ_OP_WRITE) for a sequential write required zone have
a starting LBA that differs from the write pointer, e.g. because zoned
writes have been reordered, then the storage device will respond with an
UNALIGNED WRITE COMMAND error. Send commands that failed with an
unaligned write error to the SCSI error handler if zone write locking is
disabled. Let the SCSI error handler sort SCSI commands per LBA before
resubmitting these.

If zone write locking is disabled, increase the number of retries for
write commands sent to a sequential zone to the maximum number of
outstanding commands because in the worst case the number of times
reordered zoned writes have to be retried is (number of outstanding
writes per sequential zone) - 1.
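The recovery strategy can be illustrated with a small model: sort the failed writes by starting LBA, as the comparator added to scsi_error.c does, then replay them against the write pointer. Without sorting, each pass can complete only the one command that happens to sit at the write pointer, giving the (n - 1) retry worst case above; with sorting, one pass suffices. A Python sketch with illustrative names:

```python
from functools import cmp_to_key

def scsi_cmp_lba(a, b):
    # Mirrors the comparator added to scsi_error.c: order commands by
    # starting LBA before the error handler resubmits them.
    return (a["lba"] > b["lba"]) - (a["lba"] < b["lba"])

def resubmit(write_pointer, failed_cmds):
    """Replay failed zoned writes in LBA order; for contiguous writes to
    one zone, a single sorted pass completes them all."""
    for cmd in sorted(failed_cmds, key=cmp_to_key(scsi_cmp_lba)):
        assert cmd["lba"] == write_pointer, "UNALIGNED WRITE COMMAND"
        write_pointer += cmd["blocks"]
    return write_pointer

# Three writes for one zone arrived out of order and all failed.
failed = [{"lba": 16, "blocks": 8}, {"lba": 0, "blocks": 8},
          {"lba": 8, "blocks": 8}]
print(resubmit(0, failed))  # 24: one sorted pass completes all writes
```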

Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi_error.c | 37 +++++++++++++++++++++++++++++++++++++
 drivers/scsi/scsi_lib.c   |  1 +
 drivers/scsi/sd.c         |  3 +++
 include/scsi/scsi.h       |  1 +
 4 files changed, 42 insertions(+)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index c67cdcdc3ba8..24a4e49215af 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -27,6 +27,7 @@
 #include <linux/blkdev.h>
 #include <linux/delay.h>
 #include <linux/jiffies.h>
+#include <linux/list_sort.h>
 
 #include <scsi/scsi.h>
 #include <scsi/scsi_cmnd.h>
@@ -698,6 +699,16 @@ enum scsi_disposition scsi_check_sense(struct scsi_cmnd *scmd)
 		fallthrough;
 
 	case ILLEGAL_REQUEST:
+		/*
+		 * Unaligned write command. This indicates that zoned writes
+		 * have been received by the device in the wrong order. If zone
+		 * write locking is disabled, retry after all pending commands
+		 * have completed.
+		 */
+		if (sshdr.asc == 0x21 && sshdr.ascq == 0x04 &&
+		    blk_queue_no_zone_write_lock(scsi_cmd_to_rq(scmd)->q))
+			return NEEDS_DELAYED_RETRY;
+
 		if (sshdr.asc == 0x20 || /* Invalid command operation code */
 		    sshdr.asc == 0x21 || /* Logical block address out of range */
 		    sshdr.asc == 0x22 || /* Invalid function */
@@ -2186,6 +2197,25 @@ void scsi_eh_ready_devs(struct Scsi_Host *shost,
 }
 EXPORT_SYMBOL_GPL(scsi_eh_ready_devs);
 
+/*
+ * Returns a negative value if @_a has a lower LBA than @_b, zero if
+ * both have the same LBA and a positive value otherwise.
+ */
+static int scsi_cmp_lba(void *priv, const struct list_head *_a,
+			const struct list_head *_b)
+{
+	struct scsi_cmnd *a = list_entry(_a, typeof(*a), eh_entry);
+	struct scsi_cmnd *b = list_entry(_b, typeof(*b), eh_entry);
+	const sector_t pos_a = blk_rq_pos(scsi_cmd_to_rq(a));
+	const sector_t pos_b = blk_rq_pos(scsi_cmd_to_rq(b));
+
+	if (pos_a < pos_b)
+		return -1;
+	if (pos_a > pos_b)
+		return 1;
+	return 0;
+}
+
 /**
  * scsi_eh_flush_done_q - finish processed commands or retry them.
  * @done_q:	list_head of processed commands.
@@ -2194,6 +2224,13 @@ void scsi_eh_flush_done_q(struct list_head *done_q)
 {
 	struct scsi_cmnd *scmd, *next;
 
+	/*
+	 * Sort pending SCSI commands in LBA order. This is important if one of
+	 * the SCSI devices associated with @shost is a zoned block device for
+	 * which zone write locking is disabled.
+	 */
+	list_sort(NULL, done_q, scsi_cmp_lba);
+
 	list_for_each_entry_safe(scmd, next, done_q, eh_entry) {
 		list_del_init(&scmd->eh_entry);
 		if (scsi_device_online(scmd->device) &&
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 59176946ab56..69da8aee13df 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1443,6 +1443,7 @@ static void scsi_complete(struct request *rq)
 	case ADD_TO_MLQUEUE:
 		scsi_queue_insert(cmd, SCSI_MLQUEUE_DEVICE_BUSY);
 		break;
+	case NEEDS_DELAYED_RETRY:
 	default:
 		scsi_eh_scmd_add(cmd);
 		break;
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 68b12afa0721..27b9ebe05b90 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1235,6 +1235,9 @@ static blk_status_t sd_setup_read_write_cmnd(struct scsi_cmnd *cmd)
 	cmd->transfersize = sdp->sector_size;
 	cmd->underflow = nr_blocks << 9;
 	cmd->allowed = sdkp->max_retries;
+	if (blk_queue_no_zone_write_lock(rq->q) &&
+	    blk_rq_is_seq_zoned_write(rq))
+		cmd->allowed += rq->q->nr_requests;
 	cmd->sdb.length = nr_blocks * sdp->sector_size;
 
 	SCSI_LOG_HLQUEUE(1,
diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h
index ec093594ba53..6600db046227 100644
--- a/include/scsi/scsi.h
+++ b/include/scsi/scsi.h
@@ -93,6 +93,7 @@ static inline int scsi_status_is_check_condition(int status)
  * Internal return values.
  */
 enum scsi_disposition {
+	NEEDS_DELAYED_RETRY	= 0x2000,
 	NEEDS_RETRY		= 0x2001,
 	SUCCESS			= 0x2002,
 	FAILED			= 0x2003,


* [PATCH v5 4/7] scsi: scsi_debug: Support disabling zone write locking
  2023-07-31 22:14 [PATCH v5 0/7] Improve the performance for zoned UFS devices Bart Van Assche
                   ` (2 preceding siblings ...)
  2023-07-31 22:14 ` [PATCH v5 3/7] scsi: core: Retry unaligned zoned writes Bart Van Assche
@ 2023-07-31 22:14 ` Bart Van Assche
  2023-08-07 18:13   ` Douglas Gilbert
  2023-07-31 22:14 ` [PATCH v5 5/7] scsi: scsi_debug: Support injecting unaligned write errors Bart Van Assche
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Bart Van Assche @ 2023-07-31 22:14 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Martin K . Petersen,
	Bart Van Assche, Douglas Gilbert, James E.J. Bottomley

Make it easier to test handling of QUEUE_FLAG_NO_ZONE_WRITE_LOCK by
adding support for setting this flag for scsi_debug request queues.

Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi_debug.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 9c0af50501f9..57c6242bfb26 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -832,6 +832,7 @@ static int dix_reads;
 static int dif_errors;
 
 /* ZBC global data */
+static bool sdeb_no_zwrl;
 static bool sdeb_zbc_in_use;	/* true for host-aware and host-managed disks */
 static int sdeb_zbc_zone_cap_mb;
 static int sdeb_zbc_zone_size_mb;
@@ -5138,9 +5139,13 @@ static struct sdebug_dev_info *find_build_dev_info(struct scsi_device *sdev)
 
 static int scsi_debug_slave_alloc(struct scsi_device *sdp)
 {
+	struct request_queue *q = sdp->request_queue;
+
 	if (sdebug_verbose)
 		pr_info("slave_alloc <%u %u %u %llu>\n",
 		       sdp->host->host_no, sdp->channel, sdp->id, sdp->lun);
+	if (sdeb_no_zwrl)
+		blk_queue_flag_set(QUEUE_FLAG_NO_ZONE_WRITE_LOCK, q);
 	return 0;
 }
 
@@ -5738,6 +5743,7 @@ module_param_named(ndelay, sdebug_ndelay, int, S_IRUGO | S_IWUSR);
 module_param_named(no_lun_0, sdebug_no_lun_0, int, S_IRUGO | S_IWUSR);
 module_param_named(no_rwlock, sdebug_no_rwlock, bool, S_IRUGO | S_IWUSR);
 module_param_named(no_uld, sdebug_no_uld, int, S_IRUGO);
+module_param_named(no_zone_write_lock, sdeb_no_zwrl, bool, S_IRUGO);
 module_param_named(num_parts, sdebug_num_parts, int, S_IRUGO);
 module_param_named(num_tgts, sdebug_num_tgts, int, S_IRUGO | S_IWUSR);
 module_param_named(opt_blks, sdebug_opt_blks, int, S_IRUGO);
@@ -5812,6 +5818,8 @@ MODULE_PARM_DESC(ndelay, "response delay in nanoseconds (def=0 -> ignore)");
 MODULE_PARM_DESC(no_lun_0, "no LU number 0 (def=0 -> have lun 0)");
 MODULE_PARM_DESC(no_rwlock, "don't protect user data reads+writes (def=0)");
 MODULE_PARM_DESC(no_uld, "stop ULD (e.g. sd driver) attaching (def=0))");
+MODULE_PARM_DESC(no_zone_write_lock,
+		 "Disable serialization of zoned writes (def=0)");
 MODULE_PARM_DESC(num_parts, "number of partitions(def=0)");
 MODULE_PARM_DESC(num_tgts, "number of targets per host to simulate(def=1)");
 MODULE_PARM_DESC(opt_blks, "optimal transfer length in blocks (def=1024)");


* [PATCH v5 5/7] scsi: scsi_debug: Support injecting unaligned write errors
  2023-07-31 22:14 [PATCH v5 0/7] Improve the performance for zoned UFS devices Bart Van Assche
                   ` (3 preceding siblings ...)
  2023-07-31 22:14 ` [PATCH v5 4/7] scsi: scsi_debug: Support disabling zone write locking Bart Van Assche
@ 2023-07-31 22:14 ` Bart Van Assche
  2023-08-07 18:13   ` Douglas Gilbert
  2023-07-31 22:14 ` [PATCH v5 6/7] scsi: ufs: Split an if-condition Bart Van Assche
  2023-07-31 22:14 ` [PATCH v5 7/7] scsi: ufs: Disable zone write locking Bart Van Assche
  6 siblings, 1 reply; 17+ messages in thread
From: Bart Van Assche @ 2023-07-31 22:14 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Martin K . Petersen,
	Bart Van Assche, Douglas Gilbert, James E.J. Bottomley

Allow user space software, e.g. a blktests test, to inject unaligned
write errors.
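The injection follows a one-shot pattern: a pending flag is armed by user space and consumed by the next write command, which then fails with a sense code of ILLEGAL REQUEST / unaligned write. A thread-safe Python sketch of that pattern (illustrative names; in the driver this corresponds to sdeb_inject_pending combined with the SDEBUG_OPT_UNALIGNED_WRITE option bit):

```python
import threading

class UnalignedWriteInjector:
    """One-shot error injection: arm once, fail exactly one write."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = False

    def arm(self):
        with self._lock:
            self._pending = True

    def should_fail(self):
        # Test-and-clear, so exactly one write sees the injected error.
        with self._lock:
            pending, self._pending = self._pending, False
            return pending

inj = UnalignedWriteInjector()
inj.arm()
print(inj.should_fail())  # True: the next write gets the injected error
print(inj.should_fail())  # False: the error fires only once
```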

Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi_debug.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 57c6242bfb26..051b0605f11f 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -181,6 +181,7 @@ static const char *sdebug_version_date = "20210520";
 #define SDEBUG_OPT_NO_CDB_NOISE		0x4000
 #define SDEBUG_OPT_HOST_BUSY		0x8000
 #define SDEBUG_OPT_CMD_ABORT		0x10000
+#define SDEBUG_OPT_UNALIGNED_WRITE	0x20000
 #define SDEBUG_OPT_ALL_NOISE (SDEBUG_OPT_NOISE | SDEBUG_OPT_Q_NOISE | \
 			      SDEBUG_OPT_RESET_NOISE)
 #define SDEBUG_OPT_ALL_INJECTING (SDEBUG_OPT_RECOVERED_ERR | \
@@ -188,7 +189,8 @@ static const char *sdebug_version_date = "20210520";
 				  SDEBUG_OPT_DIF_ERR | SDEBUG_OPT_DIX_ERR | \
 				  SDEBUG_OPT_SHORT_TRANSFER | \
 				  SDEBUG_OPT_HOST_BUSY | \
-				  SDEBUG_OPT_CMD_ABORT)
+				  SDEBUG_OPT_CMD_ABORT | \
+				  SDEBUG_OPT_UNALIGNED_WRITE)
 #define SDEBUG_OPT_RECOV_DIF_DIX (SDEBUG_OPT_RECOVERED_ERR | \
 				  SDEBUG_OPT_DIF_ERR | SDEBUG_OPT_DIX_ERR)
 
@@ -3587,6 +3589,14 @@ static int resp_write_dt0(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 	struct sdeb_store_info *sip = devip2sip(devip, true);
 	u8 *cmd = scp->cmnd;
 
+	if (unlikely(sdebug_opts & SDEBUG_OPT_UNALIGNED_WRITE &&
+		     atomic_read(&sdeb_inject_pending))) {
+		atomic_set(&sdeb_inject_pending, 0);
+		mk_sense_buffer(scp, ILLEGAL_REQUEST, LBA_OUT_OF_RANGE,
+				UNALIGNED_WRITE_ASCQ);
+		return check_condition_result;
+	}
+
 	switch (cmd[0]) {
 	case WRITE_16:
 		ei_lba = 0;


* [PATCH v5 6/7] scsi: ufs: Split an if-condition
  2023-07-31 22:14 [PATCH v5 0/7] Improve the performance for zoned UFS devices Bart Van Assche
                   ` (4 preceding siblings ...)
  2023-07-31 22:14 ` [PATCH v5 5/7] scsi: scsi_debug: Support injecting unaligned write errors Bart Van Assche
@ 2023-07-31 22:14 ` Bart Van Assche
  2023-07-31 22:14 ` [PATCH v5 7/7] scsi: ufs: Disable zone write locking Bart Van Assche
  6 siblings, 0 replies; 17+ messages in thread
From: Bart Van Assche @ 2023-07-31 22:14 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Martin K . Petersen,
	Bart Van Assche, Avri Altman, Damien Le Moal, Ming Lei,
	James E.J. Bottomley, Stanley Chu, Can Guo, Asutosh Das,
	Bao D. Nguyen, Bean Huo, Arthur Simchaev

Make the next patch in this series easier to read. No functionality is
changed.

Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Avri Altman <avri.altman@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/ufs/core/ufshcd.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index 129446775796..ae7b868f9c26 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -4352,8 +4352,9 @@ void ufshcd_auto_hibern8_update(struct ufs_hba *hba, u32 ahit)
 	}
 	spin_unlock_irqrestore(hba->host->host_lock, flags);
 
-	if (update &&
-	    !pm_runtime_suspended(&hba->ufs_device_wlun->sdev_gendev)) {
+	if (!update)
+		return;
+	if (!pm_runtime_suspended(&hba->ufs_device_wlun->sdev_gendev)) {
 		ufshcd_rpm_get_sync(hba);
 		ufshcd_hold(hba);
 		ufshcd_auto_hibern8_enable(hba);


* [PATCH v5 7/7] scsi: ufs: Disable zone write locking
  2023-07-31 22:14 [PATCH v5 0/7] Improve the performance for zoned UFS devices Bart Van Assche
                   ` (5 preceding siblings ...)
  2023-07-31 22:14 ` [PATCH v5 6/7] scsi: ufs: Split an if-condition Bart Van Assche
@ 2023-07-31 22:14 ` Bart Van Assche
  6 siblings, 0 replies; 17+ messages in thread
From: Bart Van Assche @ 2023-07-31 22:14 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Martin K . Petersen,
	Bart Van Assche, Avri Altman, Damien Le Moal, Ming Lei,
	James E.J. Bottomley, Stanley Chu, Can Guo, Bean Huo,
	Asutosh Das, Bao D. Nguyen, Arthur Simchaev

From the UFSHCI 4.0 specification, about the legacy (single queue) mode:
"The host controller always process transfer requests in-order according
to the order submitted to the list. In case of multiple commands with
single doorbell register ringing (batch mode), The dispatch order for
these transfer requests by host controller will base on their index in
the List. A transfer request with lower index value will be executed
before a transfer request with higher index value."

From the UFSHCI 4.0 specification, about the MCQ mode:
"Command Submission
1. Host SW writes an Entry to SQ
2. Host SW updates SQ doorbell tail pointer

Command Processing
3. After fetching the Entry, Host Controller updates SQ doorbell head
   pointer
4. Host controller sends COMMAND UPIU to UFS device"

In other words, for both legacy and MCQ mode, UFS controllers are
required to forward commands to the UFS device in the order these
commands have been received from the host.

Notes:
- For legacy mode this is only correct if the host submits one
  command at a time. The UFS driver does this.
- Also in legacy mode, the command order is not preserved if
  auto-hibernation is enabled in the UFS controller. Hence, enable
  zone write locking if auto-hibernation is enabled.
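One way to read the rules above is as a single predicate for when the controller preserves write order. A Python sketch (this condenses the commit message; it is not a function in the driver):

```python
def controller_preserves_write_order(mcq_enabled, auto_hibern8_enabled):
    """MCQ mode always forwards commands in submission order; legacy
    (single doorbell) mode does so only while auto-hibernation is
    disabled."""
    return mcq_enabled or not auto_hibern8_enabled

assert controller_preserves_write_order(True, True)
assert controller_preserves_write_order(False, False)
# Legacy mode with auto-hibernation enabled: order is not guaranteed,
# so the driver must re-enable zone write locking.
assert not controller_preserves_write_order(False, True)
```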

This patch improves performance as follows on my test setup:
- With the mq-deadline scheduler: 2.5x more IOPS for small writes.
- When not using an I/O scheduler compared to using mq-deadline with
  zone locking: 4x more IOPS for small writes.

Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Avri Altman <avri.altman@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/ufs/core/ufshcd.c | 40 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 39 insertions(+), 1 deletion(-)

diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index ae7b868f9c26..3c99516c38fa 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -4337,23 +4337,53 @@ int ufshcd_uic_hibern8_exit(struct ufs_hba *hba)
 }
 EXPORT_SYMBOL_GPL(ufshcd_uic_hibern8_exit);
 
+static void ufshcd_update_no_zone_write_lock(struct ufs_hba *hba,
+					     bool set_no_zone_write_lock)
+{
+	struct scsi_device *sdev;
+
+	shost_for_each_device(sdev, hba->host)
+		blk_freeze_queue_start(sdev->request_queue);
+	shost_for_each_device(sdev, hba->host) {
+		struct request_queue *q = sdev->request_queue;
+
+		blk_mq_freeze_queue_wait(q);
+		if (set_no_zone_write_lock)
+			blk_queue_flag_set(QUEUE_FLAG_NO_ZONE_WRITE_LOCK, q);
+		else
+			blk_queue_flag_clear(QUEUE_FLAG_NO_ZONE_WRITE_LOCK, q);
+		blk_mq_unfreeze_queue(q);
+	}
+}
+
 void ufshcd_auto_hibern8_update(struct ufs_hba *hba, u32 ahit)
 {
 	unsigned long flags;
-	bool update = false;
+	bool prev_state, new_state, update = false;
 
 	if (!ufshcd_is_auto_hibern8_supported(hba))
 		return;
 
 	spin_lock_irqsave(hba->host->host_lock, flags);
+	prev_state = ufshcd_is_auto_hibern8_enabled(hba);
 	if (hba->ahit != ahit) {
 		hba->ahit = ahit;
 		update = true;
 	}
+	new_state = ufshcd_is_auto_hibern8_enabled(hba);
 	spin_unlock_irqrestore(hba->host->host_lock, flags);
 
 	if (!update)
 		return;
+	if (!is_mcq_enabled(hba) && !prev_state && new_state) {
+		/*
+		 * Auto-hibernation will be enabled. Enable write locking for
+		 * zoned writes since auto-hibernation may cause reordering of
+		 * zoned writes when using the legacy mode of the UFS host
+		 * controller.
+		 */
+		ufshcd_update_no_zone_write_lock(hba, false);
+	}
 	if (!pm_runtime_suspended(&hba->ufs_device_wlun->sdev_gendev)) {
 		ufshcd_rpm_get_sync(hba);
 		ufshcd_hold(hba);
@@ -4361,6 +4391,13 @@ void ufshcd_auto_hibern8_update(struct ufs_hba *hba, u32 ahit)
 		ufshcd_release(hba);
 		ufshcd_rpm_put_sync(hba);
 	}
+	if (!is_mcq_enabled(hba) && prev_state && !new_state) {
+		/*
+		 * Auto-hibernation has been disabled. Disable write locking
+		 * for zoned writes.
+		 */
+		ufshcd_update_no_zone_write_lock(hba, true);
+	}
 }
 EXPORT_SYMBOL_GPL(ufshcd_auto_hibern8_update);
 
@@ -5140,6 +5177,7 @@ static int ufshcd_slave_configure(struct scsi_device *sdev)
 
 	ufshcd_hpb_configure(hba, sdev);
 
+	blk_queue_flag_set(QUEUE_FLAG_NO_ZONE_WRITE_LOCK, q);
 	blk_queue_update_dma_pad(q, PRDT_DATA_BYTE_COUNT_PAD - 1);
 	if (hba->quirks & UFSHCD_QUIRK_4KB_DMA_ALIGNMENT)
 		blk_queue_update_dma_alignment(q, SZ_4K - 1);


* Re: [PATCH v5 1/7] block: Introduce the flag QUEUE_FLAG_NO_ZONE_WRITE_LOCK
  2023-07-31 22:14 ` [PATCH v5 1/7] block: Introduce the flag QUEUE_FLAG_NO_ZONE_WRITE_LOCK Bart Van Assche
@ 2023-08-02  9:12   ` Damien Le Moal
  0 siblings, 0 replies; 17+ messages in thread
From: Damien Le Moal @ 2023-08-02  9:12 UTC (permalink / raw)
  To: Bart Van Assche, Jens Axboe
  Cc: linux-block, Christoph Hellwig, Martin K . Petersen, Ming Lei

On 8/1/23 07:14, Bart Van Assche wrote:
> Writes in sequential write required zones must happen at the write
> pointer. Even if the submitter of the write commands (e.g. a filesystem)
> submits writes for sequential write required zones in order, the block
> layer or the storage controller may reorder these write commands.
> 
> The zone locking mechanism in the mq-deadline I/O scheduler serializes
> write commands for sequential zones. Some but not all storage controllers
> require this serialization. Introduce a new request queue flag to allow
> block drivers to indicate that they preserve the order of write commands
> and thus do not require serialization of writes per zone.
> 
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Damien Le Moal <dlemoal@kernel.org>
> Cc: Ming Lei <ming.lei@redhat.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>

Looks OK.
Very minor nit below.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>

> ---
>  include/linux/blkdev.h | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 2f5371b8482c..de5e05cc34fa 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -534,6 +534,11 @@ struct request_queue {
>  #define QUEUE_FLAG_NONROT	6	/* non-rotational device (SSD) */
>  #define QUEUE_FLAG_VIRT		QUEUE_FLAG_NONROT /* paravirt device */
>  #define QUEUE_FLAG_IO_STAT	7	/* do disk/partitions IO accounting */
> +/*
> + * Do not serialize sequential writes (REQ_OP_WRITE, REQ_OP_WRITE_ZEROES) sent
> + * to a sequential write required zone (BLK_ZONE_TYPE_SEQWRITE_REQ).
> + */

I would be very explicit here, for this to be clear to people who are not
familiar with zone device write operation handling. Something like:

/*
 * The device supports not using the zone write locking mechanism to serialize
 * write operations (REQ_OP_WRITE, REQ_OP_WRITE_ZEROES) issued to a sequential
 * write required zone (BLK_ZONE_TYPE_SEQWRITE_REQ).
 */

> +#define QUEUE_FLAG_NO_ZONE_WRITE_LOCK 8
>  #define QUEUE_FLAG_NOXMERGES	9	/* No extended merges */
>  #define QUEUE_FLAG_ADD_RANDOM	10	/* Contributes to random pool */
>  #define QUEUE_FLAG_SYNCHRONOUS	11	/* always completes in submit context */
> @@ -597,6 +602,11 @@ bool blk_queue_flag_test_and_set(unsigned int flag, struct request_queue *q);
>  #define blk_queue_skip_tagset_quiesce(q) \
>  	test_bit(QUEUE_FLAG_SKIP_TAGSET_QUIESCE, &(q)->queue_flags)
>  
> +static inline bool blk_queue_no_zone_write_lock(struct request_queue *q)
> +{
> +	return test_bit(QUEUE_FLAG_NO_ZONE_WRITE_LOCK, &q->queue_flags);
> +}
> +
>  extern void blk_set_pm_only(struct request_queue *q);
>  extern void blk_clear_pm_only(struct request_queue *q);
>  

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v5 2/7] block/mq-deadline: Only use zone locking if necessary
  2023-07-31 22:14 ` [PATCH v5 2/7] block/mq-deadline: Only use zone locking if necessary Bart Van Assche
@ 2023-08-02  9:12   ` Damien Le Moal
  0 siblings, 0 replies; 17+ messages in thread
From: Damien Le Moal @ 2023-08-02  9:12 UTC (permalink / raw)
  To: Bart Van Assche, Jens Axboe
  Cc: linux-block, Christoph Hellwig, Martin K . Petersen, Ming Lei

On 8/1/23 07:14, Bart Van Assche wrote:
> Measurements have shown that limiting the queue depth to one per zone for
> zoned writes has a significant negative performance impact on zoned UFS
> devices. Hence this patch that disables zone locking by the mq-deadline
> scheduler if the storage controller preserves the command order. This
> patch is based on the following assumptions:
> - It happens infrequently that zoned write requests are reordered by the
>   block layer.
> - The I/O priority of all write requests is the same per zone.
> - Either no I/O scheduler is used or an I/O scheduler is used that
>   submits write requests per zone in LBA order.
> 
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Damien Le Moal <dlemoal@kernel.org>
> Cc: Ming Lei <ming.lei@redhat.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>

Looks OK.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>

> ---
>  block/mq-deadline.c | 24 ++++++++++++++++++------
>  1 file changed, 18 insertions(+), 6 deletions(-)
> 
> diff --git a/block/mq-deadline.c b/block/mq-deadline.c
> index 02a916ba62ee..1f4124dd4a0b 100644
> --- a/block/mq-deadline.c
> +++ b/block/mq-deadline.c
> @@ -338,6 +338,16 @@ static struct request *deadline_skip_seq_writes(struct deadline_data *dd,
>  	return rq;
>  }
>  
> +/*
> + * Use zone write locking if the request queue is zoned and if
> + * QUEUE_FLAG_NO_ZONE_WRITE_LOCK has not been set. Not using zone write
> + * locking is only safe if the block driver preserves the request order.
> + */
> +static bool dd_use_zone_write_locking(struct request_queue *q)
> +{
> +	return blk_queue_is_zoned(q) && !blk_queue_no_zone_write_lock(q);
> +}
> +
>  /*
>   * For the specified data direction, return the next request to
>   * dispatch using arrival ordered lists.
> @@ -353,7 +363,7 @@ deadline_fifo_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
>  		return NULL;
>  
>  	rq = rq_entry_fifo(per_prio->fifo_list[data_dir].next);
> -	if (data_dir == DD_READ || !blk_queue_is_zoned(rq->q))
> +	if (data_dir == DD_READ || !dd_use_zone_write_locking(rq->q))
>  		return rq;
>  
>  	/*
> @@ -398,7 +408,7 @@ deadline_next_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
>  	if (!rq)
>  		return NULL;
>  
> -	if (data_dir == DD_READ || !blk_queue_is_zoned(rq->q))
> +	if (data_dir == DD_READ || !dd_use_zone_write_locking(rq->q))
>  		return rq;
>  
>  	/*
> @@ -526,8 +536,9 @@ static struct request *__dd_dispatch_request(struct deadline_data *dd,
>  	}
>  
>  	/*
> -	 * For a zoned block device, if we only have writes queued and none of
> -	 * them can be dispatched, rq will be NULL.
> +	 * For a zoned block device that requires write serialization, if we
> +	 * only have writes queued and none of them can be dispatched, rq will
> +	 * be NULL.
>  	 */
>  	if (!rq)
>  		return NULL;
> @@ -552,7 +563,8 @@ static struct request *__dd_dispatch_request(struct deadline_data *dd,
>  	/*
>  	 * If the request needs its target zone locked, do it.
>  	 */
> -	blk_req_zone_write_lock(rq);
> +	if (dd_use_zone_write_locking(rq->q))
> +		blk_req_zone_write_lock(rq);
>  	rq->rq_flags |= RQF_STARTED;
>  	return rq;
>  }
> @@ -933,7 +945,7 @@ static void dd_finish_request(struct request *rq)
>  
>  	atomic_inc(&per_prio->stats.completed);
>  
> -	if (blk_queue_is_zoned(q)) {
> +	if (dd_use_zone_write_locking(rq->q)) {
>  		unsigned long flags;
>  
>  		spin_lock_irqsave(&dd->zone_lock, flags);

-- 
Damien Le Moal
Western Digital Research



* Re: [PATCH v5 3/7] scsi: core: Retry unaligned zoned writes
  2023-07-31 22:14 ` [PATCH v5 3/7] scsi: core: Retry unaligned zoned writes Bart Van Assche
@ 2023-08-02  9:12   ` Damien Le Moal
  2023-08-02  9:16   ` Hannes Reinecke
  2023-08-02 11:52   ` Christoph Hellwig
  2 siblings, 0 replies; 17+ messages in thread
From: Damien Le Moal @ 2023-08-02  9:12 UTC (permalink / raw)
  To: Bart Van Assche, Jens Axboe
  Cc: linux-block, Christoph Hellwig, Martin K . Petersen, Ming Lei,
	James E.J. Bottomley

On 8/1/23 07:14, Bart Van Assche wrote:
> If zoned writes (REQ_OP_WRITE) for a sequential write required zone have
> a starting LBA that differs from the write pointer, e.g. because zoned
> writes have been reordered, then the storage device will respond with an
> UNALIGNED WRITE COMMAND error. Send commands that failed with an
> unaligned write error to the SCSI error handler if zone write locking is
> disabled. Let the SCSI error handler sort SCSI commands per LBA before
> resubmitting these.
> 
> If zone write locking is disabled, increase the number of retries for
> write commands sent to a sequential zone to the maximum number of
> outstanding commands because in the worst case the number of times
> reordered zoned writes have to be retried is (number of outstanding
> writes per sequential zone) - 1.
> 
> Cc: Martin K. Petersen <martin.petersen@oracle.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Damien Le Moal <dlemoal@kernel.org>
> Cc: Ming Lei <ming.lei@redhat.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>

Looks OK.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>

-- 
Damien Le Moal
Western Digital Research
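[Editorial note: the retry bound in the patch description quoted above - in the
worst case a reordered zoned write is retried (number of outstanding writes per
sequential zone) - 1 times - can be checked with a small, simplified model.
The model below is an assumption-laden sketch, not kernel code: it treats LBAs
as write-sized units and ignores everything except write-pointer alignment.]

```python
def simulate(order, sort_retries=False):
    """Model n zoned writes to one sequential write required zone.

    The device accepts a write only when its LBA matches the zone's
    write pointer; every rejected write counts as one retry.  Failed
    writes are re-issued after all pending commands have completed,
    either in their original order or sorted by LBA (mirroring the
    list_sort() call added to scsi_eh_flush_done_q()).
    """
    wp = 0                          # zone write pointer, in write-sized units
    retries = {lba: 0 for lba in order}
    pending = list(order)
    while pending:
        failed = []
        for lba in pending:
            if lba == wp:
                wp += 1             # aligned: write accepted
            else:
                failed.append(lba)  # UNALIGNED WRITE COMMAND
                retries[lba] += 1
        if sort_retries:
            failed.sort()           # error handler sorts retries by LBA
        pending = failed
    return retries

n = 8
reverse = list(range(n - 1, -1, -1))
# Unsorted retries, worst case: the highest LBA is retried n - 1 times,
# matching the increase of cmd->allowed by the queue depth in sd.c.
assert max(simulate(reverse).values()) == n - 1
# With retries sorted by LBA, one retry round suffices in this model.
assert max(simulate(reverse, sort_retries=True).values()) == 1
```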



* Re: [PATCH v5 3/7] scsi: core: Retry unaligned zoned writes
  2023-07-31 22:14 ` [PATCH v5 3/7] scsi: core: Retry unaligned zoned writes Bart Van Assche
  2023-08-02  9:12   ` Damien Le Moal
@ 2023-08-02  9:16   ` Hannes Reinecke
  2023-08-02  9:31     ` Damien Le Moal
  2023-08-02 11:52   ` Christoph Hellwig
  2 siblings, 1 reply; 17+ messages in thread
From: Hannes Reinecke @ 2023-08-02  9:16 UTC (permalink / raw)
  To: Bart Van Assche, Jens Axboe
  Cc: linux-block, Christoph Hellwig, Martin K . Petersen,
	Damien Le Moal, Ming Lei, James E.J. Bottomley

On 8/1/23 00:14, Bart Van Assche wrote:
> If zoned writes (REQ_OP_WRITE) for a sequential write required zone have
> a starting LBA that differs from the write pointer, e.g. because zoned
> writes have been reordered, then the storage device will respond with an
> UNALIGNED WRITE COMMAND error. Send commands that failed with an
> unaligned write error to the SCSI error handler if zone write locking is
> disabled. Let the SCSI error handler sort SCSI commands per LBA before
> resubmitting these.
> 
> If zone write locking is disabled, increase the number of retries for
> write commands sent to a sequential zone to the maximum number of
> outstanding commands because in the worst case the number of times
> reordered zoned writes have to be retried is (number of outstanding
> writes per sequential zone) - 1.
> 
> Cc: Martin K. Petersen <martin.petersen@oracle.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Damien Le Moal <dlemoal@kernel.org>
> Cc: Ming Lei <ming.lei@redhat.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>   drivers/scsi/scsi_error.c | 37 +++++++++++++++++++++++++++++++++++++
>   drivers/scsi/scsi_lib.c   |  1 +
>   drivers/scsi/sd.c         |  3 +++
>   include/scsi/scsi.h       |  1 +
>   4 files changed, 42 insertions(+)
> 
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index c67cdcdc3ba8..24a4e49215af 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -27,6 +27,7 @@
>   #include <linux/blkdev.h>
>   #include <linux/delay.h>
>   #include <linux/jiffies.h>
> +#include <linux/list_sort.h>
>   
>   #include <scsi/scsi.h>
>   #include <scsi/scsi_cmnd.h>
> @@ -698,6 +699,16 @@ enum scsi_disposition scsi_check_sense(struct scsi_cmnd *scmd)
>   		fallthrough;
>   
>   	case ILLEGAL_REQUEST:
> +		/*
> +		 * Unaligned write command. This indicates that zoned writes
> +		 * have been received by the device in the wrong order. If zone
> +		 * write locking is disabled, retry after all pending commands
> +		 * have completed.
> +		 */
> +		if (sshdr.asc == 0x21 && sshdr.ascq == 0x04 &&
> +		    blk_queue_no_zone_write_lock(scsi_cmd_to_rq(scmd)->q))
> +			return NEEDS_DELAYED_RETRY;
> +
>   		if (sshdr.asc == 0x20 || /* Invalid command operation code */
>   		    sshdr.asc == 0x21 || /* Logical block address out of range */
>   		    sshdr.asc == 0x22 || /* Invalid function */
> @@ -2186,6 +2197,25 @@ void scsi_eh_ready_devs(struct Scsi_Host *shost,
>   }
>   EXPORT_SYMBOL_GPL(scsi_eh_ready_devs);
>   
> +/*
> + * Returns a negative value if @_a has a lower LBA than @_b, zero if
> + * both have the same LBA and a positive value otherwise.
> + */
> +static int scsi_cmp_lba(void *priv, const struct list_head *_a,
> +			const struct list_head *_b)
> +{
> +	struct scsi_cmnd *a = list_entry(_a, typeof(*a), eh_entry);
> +	struct scsi_cmnd *b = list_entry(_b, typeof(*b), eh_entry);
> +	const sector_t pos_a = blk_rq_pos(scsi_cmd_to_rq(a));
> +	const sector_t pos_b = blk_rq_pos(scsi_cmd_to_rq(b));
> +
> +	if (pos_a < pos_b)
> +		return -1;
> +	if (pos_a > pos_b)
> +		return 1;
> +	return 0;
> +}
> +
>   /**
>    * scsi_eh_flush_done_q - finish processed commands or retry them.
>    * @done_q:	list_head of processed commands.
> @@ -2194,6 +2224,13 @@ void scsi_eh_flush_done_q(struct list_head *done_q)
>   {
>   	struct scsi_cmnd *scmd, *next;
>   
> +	/*
> +	 * Sort pending SCSI commands in LBA order. This is important if one of
> +	 * the SCSI devices associated with @shost is a zoned block device for
> +	 * which zone write locking is disabled.
> +	 */
> +	list_sort(NULL, done_q, scsi_cmp_lba);
> +
>   	list_for_each_entry_safe(scmd, next, done_q, eh_entry) {
>   		list_del_init(&scmd->eh_entry);
>   		if (scsi_device_online(scmd->device) &&
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 59176946ab56..69da8aee13df 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1443,6 +1443,7 @@ static void scsi_complete(struct request *rq)
>   	case ADD_TO_MLQUEUE:
>   		scsi_queue_insert(cmd, SCSI_MLQUEUE_DEVICE_BUSY);
>   		break;
> +	case NEEDS_DELAYED_RETRY:
>   	default:
>   		scsi_eh_scmd_add(cmd);
>   		break;
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index 68b12afa0721..27b9ebe05b90 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -1235,6 +1235,9 @@ static blk_status_t sd_setup_read_write_cmnd(struct scsi_cmnd *cmd)
>   	cmd->transfersize = sdp->sector_size;
>   	cmd->underflow = nr_blocks << 9;
>   	cmd->allowed = sdkp->max_retries;
> +	if (blk_queue_no_zone_write_lock(rq->q) &&
> +	    blk_rq_is_seq_zoned_write(rq))
> +		cmd->allowed += rq->q->nr_requests;
>   	cmd->sdb.length = nr_blocks * sdp->sector_size;
>   
>   	SCSI_LOG_HLQUEUE(1,
> diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h
> index ec093594ba53..6600db046227 100644
> --- a/include/scsi/scsi.h
> +++ b/include/scsi/scsi.h
> @@ -93,6 +93,7 @@ static inline int scsi_status_is_check_condition(int status)
>    * Internal return values.
>    */
>   enum scsi_disposition {
> +	NEEDS_DELAYED_RETRY	= 0x2000,
>   	NEEDS_RETRY		= 0x2001,
>   	SUCCESS			= 0x2002,
>   	FAILED			= 0x2003,
What happens for a 'real' unaligned write, i.e. a command whose starting 
LBA is before the write pointer? Wouldn't we incur additional retries?
Do we terminate at all?

Cheers,

Hannes



* Re: [PATCH v5 3/7] scsi: core: Retry unaligned zoned writes
  2023-08-02  9:16   ` Hannes Reinecke
@ 2023-08-02  9:31     ` Damien Le Moal
  0 siblings, 0 replies; 17+ messages in thread
From: Damien Le Moal @ 2023-08-02  9:31 UTC (permalink / raw)
  To: Hannes Reinecke, Bart Van Assche, Jens Axboe
  Cc: linux-block, Christoph Hellwig, Martin K . Petersen, Ming Lei,
	James E.J. Bottomley

On 8/2/23 18:16, Hannes Reinecke wrote:
> On 8/1/23 00:14, Bart Van Assche wrote:
>> If zoned writes (REQ_OP_WRITE) for a sequential write required zone have
>> a starting LBA that differs from the write pointer, e.g. because zoned
>> writes have been reordered, then the storage device will respond with an
>> UNALIGNED WRITE COMMAND error. Send commands that failed with an
>> unaligned write error to the SCSI error handler if zone write locking is
>> disabled. Let the SCSI error handler sort SCSI commands per LBA before
>> resubmitting these.
>>
>> If zone write locking is disabled, increase the number of retries for
>> write commands sent to a sequential zone to the maximum number of
>> outstanding commands because in the worst case the number of times
>> reordered zoned writes have to be retried is (number of outstanding
>> writes per sequential zone) - 1.
>>
>> Cc: Martin K. Petersen <martin.petersen@oracle.com>
>> Cc: Christoph Hellwig <hch@lst.de>
>> Cc: Damien Le Moal <dlemoal@kernel.org>
>> Cc: Ming Lei <ming.lei@redhat.com>
>> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
>> ---
>>   drivers/scsi/scsi_error.c | 37 +++++++++++++++++++++++++++++++++++++
>>   drivers/scsi/scsi_lib.c   |  1 +
>>   drivers/scsi/sd.c         |  3 +++
>>   include/scsi/scsi.h       |  1 +
>>   4 files changed, 42 insertions(+)
>>
>> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
>> index c67cdcdc3ba8..24a4e49215af 100644
>> --- a/drivers/scsi/scsi_error.c
>> +++ b/drivers/scsi/scsi_error.c
>> @@ -27,6 +27,7 @@
>>   #include <linux/blkdev.h>
>>   #include <linux/delay.h>
>>   #include <linux/jiffies.h>
>> +#include <linux/list_sort.h>
>>   
>>   #include <scsi/scsi.h>
>>   #include <scsi/scsi_cmnd.h>
>> @@ -698,6 +699,16 @@ enum scsi_disposition scsi_check_sense(struct scsi_cmnd *scmd)
>>   		fallthrough;
>>   
>>   	case ILLEGAL_REQUEST:
>> +		/*
>> +		 * Unaligned write command. This indicates that zoned writes
>> +		 * have been received by the device in the wrong order. If zone
>> +		 * write locking is disabled, retry after all pending commands
>> +		 * have completed.
>> +		 */
>> +		if (sshdr.asc == 0x21 && sshdr.ascq == 0x04 &&
>> +		    blk_queue_no_zone_write_lock(scsi_cmd_to_rq(scmd)->q))
>> +			return NEEDS_DELAYED_RETRY;
>> +
>>   		if (sshdr.asc == 0x20 || /* Invalid command operation code */
>>   		    sshdr.asc == 0x21 || /* Logical block address out of range */
>>   		    sshdr.asc == 0x22 || /* Invalid function */
>> @@ -2186,6 +2197,25 @@ void scsi_eh_ready_devs(struct Scsi_Host *shost,
>>   }
>>   EXPORT_SYMBOL_GPL(scsi_eh_ready_devs);
>>   
>> +/*
>> + * Returns a negative value if @_a has a lower LBA than @_b, zero if
>> + * both have the same LBA and a positive value otherwise.
>> + */
>> +static int scsi_cmp_lba(void *priv, const struct list_head *_a,
>> +			const struct list_head *_b)
>> +{
>> +	struct scsi_cmnd *a = list_entry(_a, typeof(*a), eh_entry);
>> +	struct scsi_cmnd *b = list_entry(_b, typeof(*b), eh_entry);
>> +	const sector_t pos_a = blk_rq_pos(scsi_cmd_to_rq(a));
>> +	const sector_t pos_b = blk_rq_pos(scsi_cmd_to_rq(b));
>> +
>> +	if (pos_a < pos_b)
>> +		return -1;
>> +	if (pos_a > pos_b)
>> +		return 1;
>> +	return 0;
>> +}
>> +
>>   /**
>>    * scsi_eh_flush_done_q - finish processed commands or retry them.
>>    * @done_q:	list_head of processed commands.
>> @@ -2194,6 +2224,13 @@ void scsi_eh_flush_done_q(struct list_head *done_q)
>>   {
>>   	struct scsi_cmnd *scmd, *next;
>>   
>> +	/*
>> +	 * Sort pending SCSI commands in LBA order. This is important if one of
>> +	 * the SCSI devices associated with @shost is a zoned block device for
>> +	 * which zone write locking is disabled.
>> +	 */
>> +	list_sort(NULL, done_q, scsi_cmp_lba);
>> +
>>   	list_for_each_entry_safe(scmd, next, done_q, eh_entry) {
>>   		list_del_init(&scmd->eh_entry);
>>   		if (scsi_device_online(scmd->device) &&
>> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
>> index 59176946ab56..69da8aee13df 100644
>> --- a/drivers/scsi/scsi_lib.c
>> +++ b/drivers/scsi/scsi_lib.c
>> @@ -1443,6 +1443,7 @@ static void scsi_complete(struct request *rq)
>>   	case ADD_TO_MLQUEUE:
>>   		scsi_queue_insert(cmd, SCSI_MLQUEUE_DEVICE_BUSY);
>>   		break;
>> +	case NEEDS_DELAYED_RETRY:
>>   	default:
>>   		scsi_eh_scmd_add(cmd);
>>   		break;
>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
>> index 68b12afa0721..27b9ebe05b90 100644
>> --- a/drivers/scsi/sd.c
>> +++ b/drivers/scsi/sd.c
>> @@ -1235,6 +1235,9 @@ static blk_status_t sd_setup_read_write_cmnd(struct scsi_cmnd *cmd)
>>   	cmd->transfersize = sdp->sector_size;
>>   	cmd->underflow = nr_blocks << 9;
>>   	cmd->allowed = sdkp->max_retries;
>> +	if (blk_queue_no_zone_write_lock(rq->q) &&
>> +	    blk_rq_is_seq_zoned_write(rq))
>> +		cmd->allowed += rq->q->nr_requests;
>>   	cmd->sdb.length = nr_blocks * sdp->sector_size;
>>   
>>   	SCSI_LOG_HLQUEUE(1,
>> diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h
>> index ec093594ba53..6600db046227 100644
>> --- a/include/scsi/scsi.h
>> +++ b/include/scsi/scsi.h
>> @@ -93,6 +93,7 @@ static inline int scsi_status_is_check_condition(int status)
>>    * Internal return values.
>>    */
>>   enum scsi_disposition {
>> +	NEEDS_DELAYED_RETRY	= 0x2000,
>>   	NEEDS_RETRY		= 0x2001,
>>   	SUCCESS			= 0x2002,
>>   	FAILED			= 0x2003,
> What happens for a 'real' unaligned write, i.e. a command whose starting 
> LBA is before the write pointer? Wouldn't we incur additional retries?
> Do we terminate at all?

The buggy user issuing writes out of order will either eventually see the error
or get lucky and have its unaligned write reordered and succeed.

> 
> Cheers,
> 
> Hannes
> 

-- 
Damien Le Moal
Western Digital Research



* Re: [PATCH v5 3/7] scsi: core: Retry unaligned zoned writes
  2023-07-31 22:14 ` [PATCH v5 3/7] scsi: core: Retry unaligned zoned writes Bart Van Assche
  2023-08-02  9:12   ` Damien Le Moal
  2023-08-02  9:16   ` Hannes Reinecke
@ 2023-08-02 11:52   ` Christoph Hellwig
  2023-08-02 17:20     ` Bart Van Assche
  2 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2023-08-02 11:52 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, linux-block, Christoph Hellwig, Martin K . Petersen,
	Damien Le Moal, Ming Lei, James E.J. Bottomley

> +/*
> + * Returns a negative value if @_a has a lower LBA than @_b, zero if
> + * both have the same LBA and a positive value otherwise.
> + */
> +static int scsi_cmp_lba(void *priv, const struct list_head *_a,
> +			const struct list_head *_b)
> +{
> +	struct scsi_cmnd *a = list_entry(_a, typeof(*a), eh_entry);
> +	struct scsi_cmnd *b = list_entry(_b, typeof(*b), eh_entry);
> +	const sector_t pos_a = blk_rq_pos(scsi_cmd_to_rq(a));
> +	const sector_t pos_b = blk_rq_pos(scsi_cmd_to_rq(b));

The SCSI core has no concept of LBAs.  Even assuming something like
this is a good idea (and I'm very doubtful) it would have to live
in sd.c.



* Re: [PATCH v5 3/7] scsi: core: Retry unaligned zoned writes
  2023-08-02 11:52   ` Christoph Hellwig
@ 2023-08-02 17:20     ` Bart Van Assche
  0 siblings, 0 replies; 17+ messages in thread
From: Bart Van Assche @ 2023-08-02 17:20 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, linux-block, Martin K . Petersen, Damien Le Moal,
	Ming Lei, James E.J. Bottomley

On 8/2/23 04:52, Christoph Hellwig wrote:
>> +/*
>> + * Returns a negative value if @_a has a lower LBA than @_b, zero if
>> + * both have the same LBA and a positive value otherwise.
>> + */
>> +static int scsi_cmp_lba(void *priv, const struct list_head *_a,
>> +			const struct list_head *_b)
>> +{
>> +	struct scsi_cmnd *a = list_entry(_a, typeof(*a), eh_entry);
>> +	struct scsi_cmnd *b = list_entry(_b, typeof(*b), eh_entry);
>> +	const sector_t pos_a = blk_rq_pos(scsi_cmd_to_rq(a));
>> +	const sector_t pos_b = blk_rq_pos(scsi_cmd_to_rq(b));
> 
> The SCSI core has no concept of LBAs.

Agreed.

> Even assuming something like this is a good idea (and I'm very
 > doubtful) it would have to live in sd.c.

I can rename scsi_cmp_lba() to scsi_cmp_pos() or scsi_cmp_sector(). As 
you know, the flow of the sector information is as follows (the list 
below is incomplete):
* The filesystem allocates a bio, sets bi_sector and calls
   bio_add_page().
* The filesystem calls submit_bio() and submit_bio() calls
   blk_mq_bio_to_request() indirectly. blk_mq_bio_to_request() copies
   bio->bi_iter.bi_sector into rq->__sector.
* scsi_queue_rq() is called and prepares a SCSI CDB by calling
   sd_init_command() via a function pointer in sd_template.
* The SCSI CDB is submitted to the LLD by calling
   host->hostt->queuecommand().

Since the rq->__sector information is provided before a request is 
submitted by the SCSI core and since SCSI ULD drivers are not allowed to 
modify that information, I think it is fine to read that information in 
the SCSI core by calling blk_rq_pos().

Moving the code for sorting SCSI commands by sector into sd.c would be 
cumbersome. It could be done e.g. as follows:
* Add a new function pointer in struct scsi_driver, e.g.
   prepare_reinsert.
* In the sd driver, provide an implementation for that callback and make
   it iterate over all requests and only sort those requests related to
   the sd driver.
* In the scsi_eh_flush_done_q(), add the following logic:
   - Iterate a first time over all SCSI commands that will be reinserted
     and perform the equivalent of the shell command sort -u on the
     scsi_driver.prepare_reinsert function pointers.
   - For each distinct scsi_driver.prepare_reinsert function pointer,
     call the function it points at and pass the entire list of commands
     to that function.

My opinion is that moving the logic of sorting SCSI commands by sector
into the sd driver will result in more complicated code without 
providing a real benefit. Did I perhaps overlook something?

Thanks,

Bart.


* Re: [PATCH v5 4/7] scsi: scsi_debug: Support disabling zone write locking
  2023-07-31 22:14 ` [PATCH v5 4/7] scsi: scsi_debug: Support disabling zone write locking Bart Van Assche
@ 2023-08-07 18:13   ` Douglas Gilbert
  0 siblings, 0 replies; 17+ messages in thread
From: Douglas Gilbert @ 2023-08-07 18:13 UTC (permalink / raw)
  To: Bart Van Assche, Jens Axboe
  Cc: linux-block, Christoph Hellwig, Martin K . Petersen,
	James E.J. Bottomley

On 2023-07-31 18:14, Bart Van Assche wrote:
> Make it easier to test handling of QUEUE_FLAG_NO_ZONE_WRITE_LOCK by
> adding support for setting this flag for scsi_debug request queues.
> 
> Cc: Martin K. Petersen <martin.petersen@oracle.com>
> Cc: Douglas Gilbert <dgilbert@interlog.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>

Acked-by: Douglas Gilbert <dgilbert@interlog.com>

Thanks.

> ---
>   drivers/scsi/scsi_debug.c | 8 ++++++++
>   1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
> index 9c0af50501f9..57c6242bfb26 100644
> --- a/drivers/scsi/scsi_debug.c
> +++ b/drivers/scsi/scsi_debug.c
> @@ -832,6 +832,7 @@ static int dix_reads;
>   static int dif_errors;
>   
>   /* ZBC global data */
> +static bool sdeb_no_zwrl;
>   static bool sdeb_zbc_in_use;	/* true for host-aware and host-managed disks */
>   static int sdeb_zbc_zone_cap_mb;
>   static int sdeb_zbc_zone_size_mb;
> @@ -5138,9 +5139,13 @@ static struct sdebug_dev_info *find_build_dev_info(struct scsi_device *sdev)
>   
>   static int scsi_debug_slave_alloc(struct scsi_device *sdp)
>   {
> +	struct request_queue *q = sdp->request_queue;
> +
>   	if (sdebug_verbose)
>   		pr_info("slave_alloc <%u %u %u %llu>\n",
>   		       sdp->host->host_no, sdp->channel, sdp->id, sdp->lun);
> +	if (sdeb_no_zwrl)
> +		blk_queue_flag_set(QUEUE_FLAG_NO_ZONE_WRITE_LOCK, q);
>   	return 0;
>   }
>   
> @@ -5738,6 +5743,7 @@ module_param_named(ndelay, sdebug_ndelay, int, S_IRUGO | S_IWUSR);
>   module_param_named(no_lun_0, sdebug_no_lun_0, int, S_IRUGO | S_IWUSR);
>   module_param_named(no_rwlock, sdebug_no_rwlock, bool, S_IRUGO | S_IWUSR);
>   module_param_named(no_uld, sdebug_no_uld, int, S_IRUGO);
> +module_param_named(no_zone_write_lock, sdeb_no_zwrl, bool, S_IRUGO);
>   module_param_named(num_parts, sdebug_num_parts, int, S_IRUGO);
>   module_param_named(num_tgts, sdebug_num_tgts, int, S_IRUGO | S_IWUSR);
>   module_param_named(opt_blks, sdebug_opt_blks, int, S_IRUGO);
> @@ -5812,6 +5818,8 @@ MODULE_PARM_DESC(ndelay, "response delay in nanoseconds (def=0 -> ignore)");
>   MODULE_PARM_DESC(no_lun_0, "no LU number 0 (def=0 -> have lun 0)");
>   MODULE_PARM_DESC(no_rwlock, "don't protect user data reads+writes (def=0)");
>   MODULE_PARM_DESC(no_uld, "stop ULD (e.g. sd driver) attaching (def=0))");
> +MODULE_PARM_DESC(no_zone_write_lock,
> +		 "Disable serialization of zoned writes (def=0)");
>   MODULE_PARM_DESC(num_parts, "number of partitions(def=0)");
>   MODULE_PARM_DESC(num_tgts, "number of targets per host to simulate(def=1)");
>   MODULE_PARM_DESC(opt_blks, "optimal transfer length in blocks (def=1024)");



* Re: [PATCH v5 5/7] scsi: scsi_debug: Support injecting unaligned write errors
  2023-07-31 22:14 ` [PATCH v5 5/7] scsi: scsi_debug: Support injecting unaligned write errors Bart Van Assche
@ 2023-08-07 18:13   ` Douglas Gilbert
  0 siblings, 0 replies; 17+ messages in thread
From: Douglas Gilbert @ 2023-08-07 18:13 UTC (permalink / raw)
  To: Bart Van Assche, Jens Axboe
  Cc: linux-block, Christoph Hellwig, Martin K . Petersen,
	James E.J. Bottomley

On 2023-07-31 18:14, Bart Van Assche wrote:
> Allow user space software, e.g. a blktests test, to inject unaligned
> write errors.
> 
> Cc: Martin K. Petersen <martin.petersen@oracle.com>
> Cc: Douglas Gilbert <dgilbert@interlog.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>

Acked-by: Douglas Gilbert <dgilbert@interlog.com>

Thanks.

> ---
>   drivers/scsi/scsi_debug.c | 12 +++++++++++-
>   1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
> index 57c6242bfb26..051b0605f11f 100644
> --- a/drivers/scsi/scsi_debug.c
> +++ b/drivers/scsi/scsi_debug.c
> @@ -181,6 +181,7 @@ static const char *sdebug_version_date = "20210520";
>   #define SDEBUG_OPT_NO_CDB_NOISE		0x4000
>   #define SDEBUG_OPT_HOST_BUSY		0x8000
>   #define SDEBUG_OPT_CMD_ABORT		0x10000
> +#define SDEBUG_OPT_UNALIGNED_WRITE	0x20000
>   #define SDEBUG_OPT_ALL_NOISE (SDEBUG_OPT_NOISE | SDEBUG_OPT_Q_NOISE | \
>   			      SDEBUG_OPT_RESET_NOISE)
>   #define SDEBUG_OPT_ALL_INJECTING (SDEBUG_OPT_RECOVERED_ERR | \
> @@ -188,7 +189,8 @@ static const char *sdebug_version_date = "20210520";
>   				  SDEBUG_OPT_DIF_ERR | SDEBUG_OPT_DIX_ERR | \
>   				  SDEBUG_OPT_SHORT_TRANSFER | \
>   				  SDEBUG_OPT_HOST_BUSY | \
> -				  SDEBUG_OPT_CMD_ABORT)
> +				  SDEBUG_OPT_CMD_ABORT | \
> +				  SDEBUG_OPT_UNALIGNED_WRITE)
>   #define SDEBUG_OPT_RECOV_DIF_DIX (SDEBUG_OPT_RECOVERED_ERR | \
>   				  SDEBUG_OPT_DIF_ERR | SDEBUG_OPT_DIX_ERR)
>   
> @@ -3587,6 +3589,14 @@ static int resp_write_dt0(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
>   	struct sdeb_store_info *sip = devip2sip(devip, true);
>   	u8 *cmd = scp->cmnd;
>   
> +	if (unlikely(sdebug_opts & SDEBUG_OPT_UNALIGNED_WRITE &&
> +		     atomic_read(&sdeb_inject_pending))) {
> +		atomic_set(&sdeb_inject_pending, 0);
> +		mk_sense_buffer(scp, ILLEGAL_REQUEST, LBA_OUT_OF_RANGE,
> +				UNALIGNED_WRITE_ASCQ);
> +		return check_condition_result;
> +	}
> +
>   	switch (cmd[0]) {
>   	case WRITE_16:
>   		ei_lba = 0;



end of thread, other threads:[~2023-08-07 18:19 UTC | newest]

Thread overview: 17+ messages
2023-07-31 22:14 [PATCH v5 0/7] Improve the performance for zoned UFS devices Bart Van Assche
2023-07-31 22:14 ` [PATCH v5 1/7] block: Introduce the flag QUEUE_FLAG_NO_ZONE_WRITE_LOCK Bart Van Assche
2023-08-02  9:12   ` Damien Le Moal
2023-07-31 22:14 ` [PATCH v5 2/7] block/mq-deadline: Only use zone locking if necessary Bart Van Assche
2023-08-02  9:12   ` Damien Le Moal
2023-07-31 22:14 ` [PATCH v5 3/7] scsi: core: Retry unaligned zoned writes Bart Van Assche
2023-08-02  9:12   ` Damien Le Moal
2023-08-02  9:16   ` Hannes Reinecke
2023-08-02  9:31     ` Damien Le Moal
2023-08-02 11:52   ` Christoph Hellwig
2023-08-02 17:20     ` Bart Van Assche
2023-07-31 22:14 ` [PATCH v5 4/7] scsi: scsi_debug: Support disabling zone write locking Bart Van Assche
2023-08-07 18:13   ` Douglas Gilbert
2023-07-31 22:14 ` [PATCH v5 5/7] scsi: scsi_debug: Support injecting unaligned write errors Bart Van Assche
2023-08-07 18:13   ` Douglas Gilbert
2023-07-31 22:14 ` [PATCH v5 6/7] scsi: ufs: Split an if-condition Bart Van Assche
2023-07-31 22:14 ` [PATCH v5 7/7] scsi: ufs: Disable zone write locking Bart Van Assche
