All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] Zoned block device support fixes
@ 2017-08-01  9:39 Damien Le Moal
  2017-08-01  9:39 ` [PATCH 1/2] block: Zoned block device single-threaded submission Damien Le Moal
  2017-08-01  9:39 ` [PATCH 2/2] sd_zbc: Disable zone locking with scsi-mq enabled Damien Le Moal
  0 siblings, 2 replies; 4+ messages in thread
From: Damien Le Moal @ 2017-08-01  9:39 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, Jens Axboe
  Cc: Hannes Reinecke, Bart Van Assche, Christoph Hellwig

This small series addresses a couple of problems detected with 4.13-rc.

The first patch ensures that a well behaved host managed zoned block device
user (an application doing direct disk accesses, f2fs or dm-zoned) will not
see unaligned write errors due to reordering of write commands at dispatch
time.

The second patch addresses a request dispatch deadlock that can very easily
trigger with f2fs or dm-zoned when scsi-mq is enabled. The root cause of this
problem is the high probability of unintended reordering of sequential writes
in the dispatch queue due to concurrent requeue and insert events. This
patch only fixes the deadlock problem and is not a fix for the root cause
itself, which will require most likely a different approach as no easy fix
seem to be possible at the scsi-mq level.

This means that host managed zoned block devices cannot be reliably used under
a regular asynchronous (queued) write BIO issuing pattern with scsi-mq enabled
for now.

Damien Le Moal (1):
  sd_zbc: Disable zone locking with scsi-mq enabled

Hannes Reinecke (1):
  block: Zoned block device single-threaded submission

 block/blk-core.c      | 7 +++++++
 drivers/scsi/sd_zbc.c | 7 ++++++-
 2 files changed, 13 insertions(+), 1 deletion(-)

-- 
2.13.3

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 1/2] block: Zoned block device single-threaded submission
  2017-08-01  9:39 [PATCH 0/2] Zoned block device support fixes Damien Le Moal
@ 2017-08-01  9:39 ` Damien Le Moal
  2017-08-01  9:39 ` [PATCH 2/2] sd_zbc: Disable zone locking with scsi-mq enabled Damien Le Moal
  1 sibling, 0 replies; 4+ messages in thread
From: Damien Le Moal @ 2017-08-01  9:39 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, Jens Axboe
  Cc: Hannes Reinecke, Bart Van Assche, Christoph Hellwig

From: Hannes Reinecke <hare@suse.de>

The scsi_request_fn() dispatch function internally unlocks the request
queue before submitting a request to the underlying LLD. This can
potentially lead to write request reordering if the context executing
scsi_request_fn() is preempted before the request is submitted to the
LLD and another context start the same function execution.

This is not a problem for regular disks but leads to write I/O errors
on host managed zoned block devices and reduce the effectivness of
sequential write optimizations for host aware disks.
(Note: the zone write lock in place in the scsi command init code will
prevent multiple writes from being issued simultaneously to the same
zone to avoid HBA level reordering issues, but this locking mechanism
is ineffective to prevent reordering at this level)

Prevent this from happening by limiting the number of context that can
simultaneously execute the queue request_fn() function to a single
thread.

A similar patch was originally proposed by Hannes Reinecke in a first
set of patches implementing ZBC support but ultimately not included in
the final support implementation. See commit 92f5e2a295
"block: add flag for single-threaded submission" in the tree
https://git.kernel.org/pub/scm/linux/kernel/git/hare/scsi-devel.git/log/?h=zac.v3

Authorship thus goes to Hannes.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 block/blk-core.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/block/blk-core.c b/block/blk-core.c
index dbecbf4a64e0..cf590cbddcfd 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -371,7 +371,14 @@ inline void __blk_run_queue_uncond(struct request_queue *q)
 	 * running such a request function concurrently. Keep track of the
 	 * number of active request_fn invocations such that blk_drain_queue()
 	 * can wait until all these request_fn calls have finished.
+	 *
+	 * For zoned block devices, do not allow multiple threads to
+	 * dequeue requests as this can lead to write request reordering
+	 * during the time the queue is unlocked.
 	 */
+	if (blk_queue_is_zoned(q) && q->request_fn_active)
+		return;
+
 	q->request_fn_active++;
 	q->request_fn(q);
 	q->request_fn_active--;
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 2/2] sd_zbc: Disable zone locking with scsi-mq enabled
  2017-08-01  9:39 [PATCH 0/2] Zoned block device support fixes Damien Le Moal
  2017-08-01  9:39 ` [PATCH 1/2] block: Zoned block device single-threaded submission Damien Le Moal
@ 2017-08-01  9:39 ` Damien Le Moal
  2017-08-01 13:46   ` Bart Van Assche
  1 sibling, 1 reply; 4+ messages in thread
From: Damien Le Moal @ 2017-08-01  9:39 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, Jens Axboe
  Cc: Hannes Reinecke, Bart Van Assche, Christoph Hellwig

Unlike the legacy blk-sq infrastructure, blk-mq cannot provide any sort
of guarantees to well behaved users regarding write command ordering for
zoned block devices. This is due to the lockless manipulation of the dispatch
queue as well as the use of different work queues for requeuing requests and
running queues. Request in flight may exist and a combination of context
preemption and request requeue event can easily reorder a sequential
write sequence in the dispatch queue.

This problem is not solved by the zone write lock mechanism in place in
sd_zbc.c (called from sd_init_cmnd() and sd_done()). On the contrary this
locking makes the problem even worse due to the high number of requeue events
it causes under heavy write workloads. Furthermore, this zone locking can
generate dispatch queue deadlocks if a write command that acquired a
zone lock is requeued and ends up behind a write command which has not been
yet prepared but is directed to the same locked zone. In such case, trying to
dispatch this write command will systematically return BLKPREP_DEFER and force
a requeue at the head of the dispatch queue. Subsequent dispatch execution will
fail again to issue any request in the same manner, stalling request dispatch
to the device endlessly.

This patch does not solve the blk-mq reordering problem, but prevent this
deadlock problem from hapenning by disabling zone locking when scsi-mq is
enabled.

Take away from this: ZBC/ZAC host managed drive cannot be used safely
with scsi-mq enabled for now, unless the drive users limit the number of
writes per sequential zone to 1 at all times. Both f2fs and dm-zoned do not
do this and so cannot be used safely with scsi-mq enabled.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 drivers/scsi/sd_zbc.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 96855df9f49d..78fb51a37e86 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -528,15 +528,20 @@ static int sd_zbc_check_zone_size(struct scsi_disk *sdkp)
 
 static int sd_zbc_setup(struct scsi_disk *sdkp)
 {
+	struct request_queue *q = sdkp->disk->queue;
 
 	/* chunk_sectors indicates the zone size */
-	blk_queue_chunk_sectors(sdkp->disk->queue,
+	blk_queue_chunk_sectors(q,
 			logical_to_sectors(sdkp->device, sdkp->zone_blocks));
 	sdkp->zone_shift = ilog2(sdkp->zone_blocks);
 	sdkp->nr_zones = sdkp->capacity >> sdkp->zone_shift;
 	if (sdkp->capacity & (sdkp->zone_blocks - 1))
 		sdkp->nr_zones++;
 
+	/* Do not use zone locking in mq case */
+	if (q->mq_ops)
+		return 0;
+
 	if (!sdkp->zones_wlock) {
 		sdkp->zones_wlock = kcalloc(BITS_TO_LONGS(sdkp->nr_zones),
 					    sizeof(unsigned long),
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH 2/2] sd_zbc: Disable zone locking with scsi-mq enabled
  2017-08-01  9:39 ` [PATCH 2/2] sd_zbc: Disable zone locking with scsi-mq enabled Damien Le Moal
@ 2017-08-01 13:46   ` Bart Van Assche
  0 siblings, 0 replies; 4+ messages in thread
From: Bart Van Assche @ 2017-08-01 13:46 UTC (permalink / raw)
  To: linux-scsi, Damien Le Moal, martin.petersen, axboe; +Cc: hch, hare

On Tue, 2017-08-01 at 18:39 +0900, Damien Le Moal wrote:
> diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
> index 96855df9f49d..78fb51a37e86 100644
> --- a/drivers/scsi/sd_zbc.c
> +++ b/drivers/scsi/sd_zbc.c
> @@ -528,15 +528,20 @@ static int sd_zbc_check_zone_size(struct scsi_disk *sdkp)
>  
>  static int sd_zbc_setup(struct scsi_disk *sdkp)
>  {
> +	struct request_queue *q = sdkp->disk->queue;
>  
>  	/* chunk_sectors indicates the zone size */
> -	blk_queue_chunk_sectors(sdkp->disk->queue,
> +	blk_queue_chunk_sectors(q,
>  			logical_to_sectors(sdkp->device, sdkp->zone_blocks));
>  	sdkp->zone_shift = ilog2(sdkp->zone_blocks);
>  	sdkp->nr_zones = sdkp->capacity >> sdkp->zone_shift;
>  	if (sdkp->capacity & (sdkp->zone_blocks - 1))
>  		sdkp->nr_zones++;
>  
> +	/* Do not use zone locking in mq case */
> +	if (q->mq_ops)
> +		return 0;
> +
>  	if (!sdkp->zones_wlock) {
>  		sdkp->zones_wlock = kcalloc(BITS_TO_LONGS(sdkp->nr_zones),
>  					    sizeof(unsigned long),

Hello Damien,

Are you aware that the blk-sq / scsi-sq code will be removed once all block
drivers have been converted? Are you aware that we don't want any differences
in behavior like the above between the single queue and multiqueue paths? Can
you check whether the patch series Ming Lei posted earlier this week solves
the frequent requeueing? See also "[PATCH 00/14] blk-mq-sched: fix SCSI-MQ
performance regression" (http://marc.info/?l=linux-block&m=150151989915776).

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2017-08-01 13:46 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-01  9:39 [PATCH 0/2] Zoned block device support fixes Damien Le Moal
2017-08-01  9:39 ` [PATCH 1/2] block: Zoned block device single-threaded submission Damien Le Moal
2017-08-01  9:39 ` [PATCH 2/2] sd_zbc: Disable zone locking with scsi-mq enabled Damien Le Moal
2017-08-01 13:46   ` Bart Van Assche

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.