From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Hans Holmberg <Hans.Holmberg@wdc.com>,
	Hans Holmberg <hans.holmberg@wdc.com>,
	Christoph Hellwig <hch@lst.de>,
	Damien Le Moal <damien.lemoal@wdc.com>,
	Jens Axboe <axboe@kernel.dk>, Sasha Levin <sashal@kernel.org>
Subject: [PATCH 4.19 078/106] block: mq-deadline: Fix queue restart handling
Date: Sun,  6 Oct 2019 19:21:24 +0200
Message-ID: <20191006171156.638428126@linuxfoundation.org>
In-Reply-To: <20191006171124.641144086@linuxfoundation.org>

From: Damien Le Moal <damien.lemoal@wdc.com>

[ Upstream commit cb8acabbe33b110157955a7425ee876fb81e6bbc ]

Commit 7211aef86f79 ("block: mq-deadline: Fix write completion
handling") added a call to blk_mq_sched_mark_restart_hctx() in
dd_dispatch_request() to make sure that write request dispatching does
not stall when all target zones are locked. This fix left a subtle race
when a write completion happens during a dispatch execution on another
CPU:

CPU 0: Dispatch                 CPU 1: write completion

dd_dispatch_request()
    lock(&dd->lock);
    ...
    lock(&dd->zone_lock);       dd_finish_request()
    rq = find request           lock(&dd->zone_lock);
    unlock(&dd->zone_lock);
                                zone write unlock
                                unlock(&dd->zone_lock);
                                ...
                                __blk_mq_free_request
                                      check restart flag (not set)
                                      -> queue not run
    ...
    if (!rq && have writes)
        blk_mq_sched_mark_restart_hctx()
    unlock(&dd->lock)

Since the dispatch context finishes after the write request completion
handling, the restart mark it sets is not seen by
__blk_mq_free_request() and blk_mq_sched_restart() is not executed,
leading to a dispatch stall under 100% write workloads.

Fix this by moving the call to blk_mq_sched_mark_restart_hctx() from
dd_dispatch_request() into dd_finish_request() under the zone lock to
ensure full mutual exclusion between write request dispatch selection
and zone unlock on write request completion.

Fixes: 7211aef86f79 ("block: mq-deadline: Fix write completion handling")
Cc: stable@vger.kernel.org
Reported-by: Hans Holmberg <Hans.Holmberg@wdc.com>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 block/mq-deadline.c | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index d5e21ce44d2cc..69094d6410623 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -376,13 +376,6 @@ static struct request *__dd_dispatch_request(struct deadline_data *dd)
  * hardware queue, but we may return a request that is for a
  * different hardware queue. This is because mq-deadline has shared
  * state for all hardware queues, in terms of sorting, FIFOs, etc.
- *
- * For a zoned block device, __dd_dispatch_request() may return NULL
- * if all the queued write requests are directed at zones that are already
- * locked due to on-going write requests. In this case, make sure to mark
- * the queue as needing a restart to ensure that the queue is run again
- * and the pending writes dispatched once the target zones for the ongoing
- * write requests are unlocked in dd_finish_request().
  */
 static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
 {
@@ -391,9 +384,6 @@ static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
 
 	spin_lock(&dd->lock);
 	rq = __dd_dispatch_request(dd);
-	if (!rq && blk_queue_is_zoned(hctx->queue) &&
-	    !list_empty(&dd->fifo_list[WRITE]))
-		blk_mq_sched_mark_restart_hctx(hctx);
 	spin_unlock(&dd->lock);
 
 	return rq;
@@ -559,6 +549,13 @@ static void dd_prepare_request(struct request *rq, struct bio *bio)
  * spinlock so that the zone is never unlocked while deadline_fifo_request()
  * or deadline_next_request() are executing. This function is called for
  * all requests, whether or not these requests complete successfully.
+ *
+ * For a zoned block device, __dd_dispatch_request() may have stopped
+ * dispatching requests if all the queued requests are write requests directed
+ * at zones that are already locked due to on-going write requests. To ensure
+ * write request dispatch progress in this case, mark the queue as needing a
+ * restart to ensure that the queue is run again after completion of the
+ * request and zones being unlocked.
  */
 static void dd_finish_request(struct request *rq)
 {
@@ -570,6 +567,12 @@ static void dd_finish_request(struct request *rq)
 
 		spin_lock_irqsave(&dd->zone_lock, flags);
 		blk_req_zone_write_unlock(rq);
+		if (!list_empty(&dd->fifo_list[WRITE])) {
+			struct blk_mq_hw_ctx *hctx;
+
+			hctx = blk_mq_map_queue(q, rq->mq_ctx->cpu);
+			blk_mq_sched_mark_restart_hctx(hctx);
+		}
 		spin_unlock_irqrestore(&dd->zone_lock, flags);
 	}
 }
-- 
2.20.1
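
The interleaving described in the commit message can be replayed with a
small single-threaded model. This is only a sketch for illustration, not
kernel code: struct model, select_write(), complete_write(),
free_request() and replay() are hypothetical names, and the boolean
fields merely stand in for the zone write lock and the hctx restart
flag. The model runs the steps in the order shown in the diagram, once
with the restart mark set late by the dispatch side (old behaviour) and
once with the mark set at zone unlock time (behaviour after this patch).

```c
#include <stdbool.h>

/* Toy model of the state shared by dispatch and completion (hypothetical). */
struct model {
	bool zone_locked;	/* target zone is write-locked by an in-flight write */
	int queued_writes;	/* writes still waiting in the scheduler FIFO */
	bool restart_marked;	/* stands in for the hctx restart flag */
	bool queue_rerun;	/* set when the free path decides to run the queue */
};

/* Dispatch selection: a write can only be picked if its zone is unlocked. */
static bool select_write(struct model *m)
{
	if (m->queued_writes > 0 && !m->zone_locked) {
		m->queued_writes--;
		m->zone_locked = true;
		return true;
	}
	return false;
}

/*
 * Completion: unlock the zone. With mark_in_completion set, also mark the
 * restart here while writes are still queued, as the patch does.
 */
static void complete_write(struct model *m, bool mark_in_completion)
{
	m->zone_locked = false;
	if (mark_in_completion && m->queued_writes > 0)
		m->restart_marked = true;
}

/* Free path: the queue is run again only if the restart mark is set. */
static void free_request(struct model *m)
{
	if (m->restart_marked) {
		m->restart_marked = false;
		m->queue_rerun = true;
	}
}

/*
 * Replay the interleaving from the commit message.
 * Returns true if the queue is run again, false if dispatch stalls.
 */
static bool replay(bool fixed)
{
	struct model m = { .zone_locked = true, .queued_writes = 1 };
	bool got;

	got = select_write(&m);		/* CPU 0: no request, zone is locked */
	complete_write(&m, fixed);	/* CPU 1: zone write unlock */
	free_request(&m);		/* CPU 1: check restart flag */
	if (!fixed && !got && m.queued_writes > 0)
		m.restart_marked = true;	/* CPU 0: old code marks too late */

	return m.queue_rerun;
}
```

In this model, replay(false) returns false: the mark is set only after
the free path has already checked it, so the queued write is never
dispatched. replay(true) returns true because the mark is set together
with the zone unlock, before the flag is checked.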



