From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Hans Holmberg <Hans.Holmberg@wdc.com>,
	Hans Holmberg <hans.holmberg@wdc.com>,
	Christoph Hellwig <hch@lst.de>,
	Damien Le Moal <damien.lemoal@wdc.com>,
	Jens Axboe <axboe@kernel.dk>, Sasha Levin <sashal@kernel.org>
Subject: [PATCH 4.19 078/106] block: mq-deadline: Fix queue restart handling
Date: Sun,  6 Oct 2019 19:21:24 +0200
Message-ID: <20191006171156.638428126@linuxfoundation.org>
In-Reply-To: <20191006171124.641144086@linuxfoundation.org>

From: Damien Le Moal <damien.lemoal@wdc.com>

[ Upstream commit cb8acabbe33b110157955a7425ee876fb81e6bbc ]

Commit 7211aef86f79 ("block: mq-deadline: Fix write completion
handling") added a call to blk_mq_sched_mark_restart_hctx() in
dd_dispatch_request() to make sure that write request dispatching does
not stall when all target zones are locked. This fix left a subtle race
when a write completion happens during a dispatch execution on another
CPU:

CPU 0: Dispatch                 CPU 1: write completion

dd_dispatch_request()
    lock(&dd->lock);
    ...
    lock(&dd->zone_lock);       dd_finish_request()
    rq = find request           lock(&dd->zone_lock);
    unlock(&dd->zone_lock);
                                zone write unlock
                                unlock(&dd->zone_lock);
                                ...
                                __blk_mq_free_request
                                      check restart flag (not set)
                                      -> queue not run
    ...
    if (!rq && have writes)
        blk_mq_sched_mark_restart_hctx()
    unlock(&dd->lock)

Since the dispatch context finishes after the write request completion
handling, the restart mark is set too late to be seen from
__blk_mq_free_request(), so blk_mq_sched_restart() is not executed,
leading to a dispatch stall under 100% write workloads.

Fix this by moving the call to blk_mq_sched_mark_restart_hctx() from
dd_dispatch_request() into dd_finish_request() under the zone lock to
ensure full mutual exclusion between write request dispatch selection
and zone unlock on write request completion.
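
The effect of moving the mark can be illustrated with a small userspace
model. This is only a sketch, not kernel code: struct model, select_write(),
free_request() and the two replay_*() helpers are hypothetical names, the
restart flag is reduced to a plain bool, and "running the queue" is reduced
to one more dispatch attempt. Each helper replays the interleaving from the
diagram in a single thread and returns whether the queued write was
eventually dispatched.

```c
#include <stdbool.h>

/* Toy model of one queued write and one locked zone (hypothetical names). */
struct model {
	bool zone_locked;	/* target zone is write-locked by an in-flight write */
	bool write_queued;	/* one write is waiting in the scheduler FIFO */
	bool restart_marked;	/* stands in for the hctx restart flag */
	bool dispatched;	/* the queued write was eventually dispatched */
};

/* Dispatch selection: the write is picked only if its zone is unlocked. */
static bool select_write(struct model *m)
{
	if (m->write_queued && !m->zone_locked) {
		m->write_queued = false;
		m->zone_locked = true;
		m->dispatched = true;
		return true;
	}
	return false;
}

/* Free path: the queue is rerun only if a restart was requested. */
static void free_request(struct model *m)
{
	if (m->restart_marked) {
		m->restart_marked = false;
		select_write(m);
	}
}

/* Old ordering: the dispatch side marks the restart after the completion. */
bool replay_old_ordering(void)
{
	struct model m = { .zone_locked = true, .write_queued = true };
	bool found;

	found = select_write(&m);	/* CPU 0: zone locked, nothing found */
	m.zone_locked = false;		/* CPU 1: zone write unlock */
	free_request(&m);		/* CPU 1: flag not set, queue not run */
	if (!found && m.write_queued)	/* CPU 0: mark arrives too late */
		m.restart_marked = true;
	return m.dispatched;
}

/* New ordering: the completion side marks the restart at zone unlock. */
bool replay_new_ordering(void)
{
	struct model m = { .zone_locked = true, .write_queued = true };

	select_write(&m);		/* CPU 0: zone locked, nothing found */
	m.zone_locked = false;		/* CPU 1: zone write unlock */
	if (m.write_queued)		/* CPU 1: mark while writes are pending */
		m.restart_marked = true;
	free_request(&m);		/* CPU 1: flag set, queue is rerun */
	return m.dispatched;
}
```

In this model replay_old_ordering() leaves the write queued with no further
completion to trigger a run, while replay_new_ordering() dispatches it. The
model does not reproduce the locking itself, only the order in which the
flag is set and checked.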

Fixes: 7211aef86f79 ("block: mq-deadline: Fix write completion handling")
Cc: stable@vger.kernel.org
Reported-by: Hans Holmberg <Hans.Holmberg@wdc.com>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 block/mq-deadline.c | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index d5e21ce44d2cc..69094d6410623 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -376,13 +376,6 @@ static struct request *__dd_dispatch_request(struct deadline_data *dd)
  * hardware queue, but we may return a request that is for a
  * different hardware queue. This is because mq-deadline has shared
  * state for all hardware queues, in terms of sorting, FIFOs, etc.
- *
- * For a zoned block device, __dd_dispatch_request() may return NULL
- * if all the queued write requests are directed at zones that are already
- * locked due to on-going write requests. In this case, make sure to mark
- * the queue as needing a restart to ensure that the queue is run again
- * and the pending writes dispatched once the target zones for the ongoing
- * write requests are unlocked in dd_finish_request().
  */
 static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
 {
@@ -391,9 +384,6 @@ static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
 
 	spin_lock(&dd->lock);
 	rq = __dd_dispatch_request(dd);
-	if (!rq && blk_queue_is_zoned(hctx->queue) &&
-	    !list_empty(&dd->fifo_list[WRITE]))
-		blk_mq_sched_mark_restart_hctx(hctx);
 	spin_unlock(&dd->lock);
 
 	return rq;
@@ -559,6 +549,13 @@ static void dd_prepare_request(struct request *rq, struct bio *bio)
  * spinlock so that the zone is never unlocked while deadline_fifo_request()
  * or deadline_next_request() are executing. This function is called for
  * all requests, whether or not these requests complete successfully.
+ *
+ * For a zoned block device, __dd_dispatch_request() may have stopped
+ * dispatching requests if all the queued requests are write requests directed
+ * at zones that are already locked due to on-going write requests. To ensure
+ * write request dispatch progress in this case, mark the queue as needing a
+ * restart to ensure that the queue is run again after completion of the
+ * request and zones being unlocked.
  */
 static void dd_finish_request(struct request *rq)
 {
@@ -570,6 +567,12 @@ static void dd_finish_request(struct request *rq)
 
 		spin_lock_irqsave(&dd->zone_lock, flags);
 		blk_req_zone_write_unlock(rq);
+		if (!list_empty(&dd->fifo_list[WRITE])) {
+			struct blk_mq_hw_ctx *hctx;
+
+			hctx = blk_mq_map_queue(q, rq->mq_ctx->cpu);
+			blk_mq_sched_mark_restart_hctx(hctx);
+		}
 		spin_unlock_irqrestore(&dd->zone_lock, flags);
 	}
 }
-- 
2.20.1