linux-kernel.vger.kernel.org archive mirror
From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Ming Lei <ming.lei@redhat.com>, Yi Zhang <yi.zhang@redhat.com>,
	Bart Van Assche <bvanassche@acm.org>,
	Jens Axboe <axboe@kernel.dk>, Sasha Levin <sashal@kernel.org>,
	linux-block@vger.kernel.org
Subject: [PATCH AUTOSEL 5.4 14/26] block: fix race between adding/removing rq qos and normal IO
Date: Mon,  5 Jul 2021 11:30:27 -0400
Message-ID: <20210705153039.1521781-14-sashal@kernel.org>
In-Reply-To: <20210705153039.1521781-1-sashal@kernel.org>

From: Ming Lei <ming.lei@redhat.com>

[ Upstream commit 2cafe29a8d03f02a3d16193bdaae2f3e82a423f9 ]

Yi reported several kernel panics such as:

[16687.001777] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
...
[16687.163549] pc : __rq_qos_track+0x38/0x60

or

[  997.690455] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
...
[  997.850347] pc : __rq_qos_done+0x2c/0x50

It turns out these are caused by a race between adding rq qos (wbt) and
normal IO, because rq_qos_add() can run while IO is being submitted. Fix
this by freezing the queue before adding or deleting rq qos on the queue.
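
To illustrate the invariant behind the fix, here is a minimal
single-threaded userspace sketch, not kernel code. Every model_* name is
hypothetical and exists only for this illustration. The real
blk_mq_freeze_queue() sleeps until in-flight IO has drained, whereas the
model simply refuses to freeze while IO is in flight, and it omits
->queue_lock entirely. It only shows that the rq_qos chain is modified
while no IO holds a queue usage reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_rqos {
	int id;
	struct model_rqos *next;
};

struct model_queue {
	struct model_rqos *rq_qos;	/* head of the policy chain */
	int in_flight;			/* IOs holding a usage reference */
	int freeze_depth;		/* > 0: new IO may not enter */
};

/* IO submission path: take a usage reference unless the queue is frozen. */
static bool model_queue_enter(struct model_queue *q)
{
	if (q->freeze_depth > 0)
		return false;	/* the kernel would wait here instead */
	q->in_flight++;
	return true;
}

/* IO completion path: drop the usage reference. */
static void model_queue_exit(struct model_queue *q)
{
	assert(q->in_flight > 0);
	q->in_flight--;
}

/* Simplification: fail instead of sleeping until in-flight IO drains. */
static bool model_try_freeze(struct model_queue *q)
{
	if (q->in_flight > 0)
		return false;
	q->freeze_depth++;
	return true;
}

static void model_unfreeze(struct model_queue *q)
{
	assert(q->freeze_depth > 0);
	q->freeze_depth--;
}

/* Link a policy at the head of the chain, only while frozen. */
static bool model_rqos_add(struct model_queue *q, struct model_rqos *rqos)
{
	if (!model_try_freeze(q))
		return false;
	rqos->next = q->rq_qos;
	q->rq_qos = rqos;
	model_unfreeze(q);
	return true;
}

/* Unlink a policy with a pointer-to-pointer walk, only while frozen. */
static bool model_rqos_del(struct model_queue *q, struct model_rqos *rqos)
{
	struct model_rqos **cur;

	if (!model_try_freeze(q))
		return false;
	for (cur = &q->rq_qos; *cur; cur = &(*cur)->next) {
		if (*cur == rqos) {
			*cur = rqos->next;
			break;
		}
	}
	model_unfreeze(q);
	return true;
}

/* Count the policies currently linked on the queue. */
static int model_chain_len(const struct model_queue *q)
{
	const struct model_rqos *cur;
	int n = 0;

	for (cur = q->rq_qos; cur; cur = cur->next)
		n++;
	return n;
}
```

In this model, an add or delete attempted while an IO holds a reference
is rejected, so the submission and completion paths never observe a
half-updated chain. The kernel gets the same guarantee by waiting for the
queue to drain rather than failing.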

rq_qos_exit() doesn't need to freeze the queue because it is called after
the queue has already been frozen.

iolatency calls rq_qos_add() while the queue is being allocated, so
freezing won't add delay because the queue usage refcount is in atomic
mode at that time.

iocost calls rq_qos_add() when a cgroup attribute file is written. It is
fine to freeze the queue at that time, since we usually freeze the queue
when storing to a queue sysfs attribute; in addition, iocost only exists
on the root cgroup.

wbt_init() calls it from blk_register_queue() and from the queue sysfs
attribute store path (queue_wb_lat_store(), on the first write, in case of
!BLK_WBT_MQ). The following patch will speed up the queue freezing in
wbt_init().

Reported-by: Yi Zhang <yi.zhang@redhat.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Link: https://lore.kernel.org/r/20210609015822.103433-2-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 block/blk-rq-qos.h | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/block/blk-rq-qos.h b/block/blk-rq-qos.h
index 2bc43e94f4c4..2bcb3495e376 100644
--- a/block/blk-rq-qos.h
+++ b/block/blk-rq-qos.h
@@ -7,6 +7,7 @@
 #include <linux/blk_types.h>
 #include <linux/atomic.h>
 #include <linux/wait.h>
+#include <linux/blk-mq.h>
 
 #include "blk-mq-debugfs.h"
 
@@ -99,8 +100,21 @@ static inline void rq_wait_init(struct rq_wait *rq_wait)
 
 static inline void rq_qos_add(struct request_queue *q, struct rq_qos *rqos)
 {
+	/*
+	 * No IO can be in-flight when adding rqos, so freeze queue, which
+	 * is fine since we only support rq_qos for blk-mq queue.
+	 *
+	 * Reuse ->queue_lock for protecting against other concurrent
+	 * rq_qos adding/deleting
+	 */
+	blk_mq_freeze_queue(q);
+
+	spin_lock_irq(&q->queue_lock);
 	rqos->next = q->rq_qos;
 	q->rq_qos = rqos;
+	spin_unlock_irq(&q->queue_lock);
+
+	blk_mq_unfreeze_queue(q);
 
 	if (rqos->ops->debugfs_attrs)
 		blk_mq_debugfs_register_rqos(rqos);
@@ -110,12 +124,22 @@ static inline void rq_qos_del(struct request_queue *q, struct rq_qos *rqos)
 {
 	struct rq_qos **cur;
 
+	/*
+	 * See comment in rq_qos_add() about freezing queue & using
+	 * ->queue_lock.
+	 */
+	blk_mq_freeze_queue(q);
+
+	spin_lock_irq(&q->queue_lock);
 	for (cur = &q->rq_qos; *cur; cur = &(*cur)->next) {
 		if (*cur == rqos) {
 			*cur = rqos->next;
 			break;
 		}
 	}
+	spin_unlock_irq(&q->queue_lock);
+
+	blk_mq_unfreeze_queue(q);
 
 	blk_mq_debugfs_unregister_rqos(rqos);
 }
-- 
2.30.2

