LKML Archive on lore.kernel.org
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Damien Le Moal <damien.lemoal@wdc.com>,
	Jens Axboe <axboe@kernel.dk>
Subject: [PATCH 4.20 27/65] block: mq-deadline: Fix write completion handling
Date: Fri, 11 Jan 2019 15:15:13 +0100
Message-ID: <20190111131100.086969226@linuxfoundation.org>
In-Reply-To: <20190111131055.331350141@linuxfoundation.org>

4.20-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Damien Le Moal <damien.lemoal@wdc.com>

commit 7211aef86f79583e59b88a0aba0bc830566f7e8e upstream.

For a zoned block device using mq-deadline, if a write request for a
zone is received while another write was already dispatched for the same
zone, dd_dispatch_request() will return NULL and the newly inserted
write request is kept in the scheduler queue waiting for the ongoing
zone write to complete. With this behavior, when no other request has
been dispatched, rq_list in blk_mq_sched_dispatch_requests() is empty
and blk_mq_sched_mark_restart_hctx() is not called. As a result, the
blk_mq_sched_restart() call in __blk_mq_free_request() does not run the
queue when the already dispatched write request completes. The newly
inserted request stays stuck in the scheduler queue until another
request is eventually submitted.

This problem does not affect SCSI disks, as the SCSI stack handles queue
restart on request completion. However, it can be triggered with the
null_blk driver with zoned mode enabled.

Fix this by always requesting a queue restart in dd_dispatch_request()
if no request was dispatched while WRITE requests are queued.

Fixes: 5700f69178e9 ("mq-deadline: Introduce zone locking support")
Cc: <stable@vger.kernel.org>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Add missing export of blk_mq_sched_mark_restart_hctx()

Signed-off-by: Jens Axboe <axboe@kernel.dk>

---
 block/blk-mq-sched.c |    3 ++-
 block/blk-mq-sched.h |    1 +
 block/mq-deadline.c  |   12 +++++++++++-
 3 files changed, 14 insertions(+), 2 deletions(-)

--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -54,13 +54,14 @@ void blk_mq_sched_assign_ioc(struct requ
  * Mark a hardware queue as needing a restart. For shared queues, maintain
  * a count of how many hardware queues are marked for restart.
  */
-static void blk_mq_sched_mark_restart_hctx(struct blk_mq_hw_ctx *hctx)
+void blk_mq_sched_mark_restart_hctx(struct blk_mq_hw_ctx *hctx)
 {
 	if (test_bit(BLK_MQ_S_SCHED_RESTART, &hctx->state))
 		return;
 
 	set_bit(BLK_MQ_S_SCHED_RESTART, &hctx->state);
 }
+EXPORT_SYMBOL_GPL(blk_mq_sched_mark_restart_hctx);
 
 void blk_mq_sched_restart(struct blk_mq_hw_ctx *hctx)
 {
--- a/block/blk-mq-sched.h
+++ b/block/blk-mq-sched.h
@@ -15,6 +15,7 @@ bool blk_mq_sched_try_merge(struct reque
 				struct request **merged_request);
 bool __blk_mq_sched_bio_merge(struct request_queue *q, struct bio *bio);
 bool blk_mq_sched_try_insert_merge(struct request_queue *q, struct request *rq);
+void blk_mq_sched_mark_restart_hctx(struct blk_mq_hw_ctx *hctx);
 void blk_mq_sched_restart(struct blk_mq_hw_ctx *hctx);
 
 void blk_mq_sched_insert_request(struct request *rq, bool at_head,
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -373,9 +373,16 @@ done:
 
 /*
  * One confusing aspect here is that we get called for a specific
- * hardware queue, but we return a request that may not be for a
+ * hardware queue, but we may return a request that is for a
  * different hardware queue. This is because mq-deadline has shared
  * state for all hardware queues, in terms of sorting, FIFOs, etc.
+ *
+ * For a zoned block device, __dd_dispatch_request() may return NULL
+ * if all the queued write requests are directed at zones that are already
+ * locked due to on-going write requests. In this case, make sure to mark
+ * the queue as needing a restart to ensure that the queue is run again
+ * and the pending writes dispatched once the target zones for the ongoing
+ * write requests are unlocked in dd_finish_request().
  */
 static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
 {
@@ -384,6 +391,9 @@ static struct request *dd_dispatch_reque
 
 	spin_lock(&dd->lock);
 	rq = __dd_dispatch_request(dd);
+	if (!rq && blk_queue_is_zoned(hctx->queue) &&
+	    !list_empty(&dd->fifo_list[WRITE]))
+		blk_mq_sched_mark_restart_hctx(hctx);
 	spin_unlock(&dd->lock);
 
 	return rq;




