linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Tejun Heo <tj@kernel.org>
To: Yu Kuai <yukuai1@huaweicloud.com>
Cc: hch@infradead.org, josef@toxicpanda.com, axboe@kernel.dk,
	cgroups@vger.kernel.org, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org, yi.zhang@huawei.com,
	yangerkun@huawei.com, "yukuai (C)" <yukuai3@huawei.com>
Subject: Re: [PATCH -next 3/4] block/rq_qos: use a global mutex to protect rq_qos apis
Date: Fri, 6 Jan 2023 08:23:27 -1000	[thread overview]
Message-ID: <Y7hnH9GT6D469Vuu@slm.duckdns.org> (raw)
In-Reply-To: <ef55a0f1-d3c2-3979-963e-2fa10ba3c2ff@huaweicloud.com>

Hello,

On Fri, Jan 06, 2023 at 09:33:26AM +0800, Yu Kuai wrote:
> > wbt's lazy init is tied to one of the block device sysfs files, right? So,
> > it *should* already be protected against device removal.
> 
> That seems not true, I don't think q->sysfs_lock can protect that,
> consider that queue_wb_lat_store() doesn't check if del_gendisk() is
> called or not:
> 
> t1: wbt lazy init		t2: remove device
> queue_attr_store
> 				del_gendisk
> 				blk_unregister_queue
> 				 mutex_lock(&q->sysfs_lock)
> 			         ...
> 				 mutex_unlock(&q->sysfs_lock);
> 				rq_qos_exit
>  mutex_lock(&q->sysfs_lock);
>   queue_wb_lat_store
>   wbt_init
>    rq_qos_add
>  mutex_unlock(&q->sysfs_lock);

So, it's not sysfs_lock but sysfs file deletion. When a kernfs, which backs
sysfs, file is removed, it disables future operations and drains all
inflight ones before returning, so you remove the interface files before
cleaning up the object that it interacts with, you don't have to worry about
racing against file operations as none can be in flight at that point.

> I tried to comfirm that by adding following delay:
> 
> diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
> index 93d9e9c9a6ea..101c33cb0a2b 100644
> --- a/block/blk-sysfs.c
> +++ b/block/blk-sysfs.c
> @@ -11,6 +11,7 @@
>  #include <linux/blktrace_api.h>
>  #include <linux/blk-mq.h>
>  #include <linux/debugfs.h>
> +#include <linux/delay.h>
> 
>  #include "blk.h"
>  #include "blk-mq.h"
> @@ -734,6 +735,8 @@ queue_attr_store(struct kobject *kobj, struct attribute
> *attr,
>         if (!entry->store)
>                 return -EIO;
> 
> +       msleep(10000);
> +
>         mutex_lock(&q->sysfs_lock);
>         res = entry->store(q, page, length);
>         mutex_unlock(&q->sysfs_lock);
> 
> And then do the following test:
> 
> 1) echo 10000 > /sys/block/sdb/queue/wbt_lat_usec &
> 2) echo 1 > /sys/block/sda/device/delete
> 
> Then, following bug is triggered:
> 
> [   51.923642] BUG: unable to handle page fault for address:
> ffffffffffffffed
> [   51.924294] #PF: supervisor read access in kernel mode
> [   51.924773] #PF: error_code(0x0000) - not-present page
> [   51.925252] PGD 1820b067 P4D 1820b067 PUD 1820d067 PMD 0
> [   51.925754] Oops: 0000 [#1] PREEMPT SMP
> [   51.926123] CPU: 1 PID: 539 Comm: bash Tainted: G        W
> 6.2.0-rc1-next-202212267
> [   51.927124] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> ?-20190727_073836-b4
> [   51.928334] RIP: 0010:__rq_qos_issue+0x30/0x60

This indicates that we aren't getting the destruction order right. It could
be that there are other reasons why the ordering is like this and we might
have to synchronize separately.

Sorry that I've been asking you to go round and round but block device
add/remove paths have always been really tricky and we wanna avoid adding
more complications if at all possible. Can you see why the device is being
destroyed before the queue attr is removed?

Thanks.

-- 
tejun

  reply	other threads:[~2023-01-06 18:23 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-04  8:53 [PATCH -next 0/4] block/rq_qos: protect rq_qos apis with global mutex Yu Kuai
2023-01-04  8:53 ` [PATCH -next 1/4] block/rq_qos: move implementions of init/exit rq-qos apis to blk-rq-qos.c Yu Kuai
2023-01-04 21:30   ` Tejun Heo
2023-01-04  8:53 ` [PATCH -next 2/4] block/rq_qos: factor out a helper to add rq_qos and activate policy Yu Kuai
2023-01-04  8:53 ` [PATCH -next 3/4] block/rq_qos: use a global mutex to protect rq_qos apis Yu Kuai
2023-01-04 21:39   ` Tejun Heo
2023-01-05  0:28     ` Tejun Heo
2023-01-05  1:35       ` Yu Kuai
2023-01-05 18:34         ` Tejun Heo
2023-01-06  1:33           ` Yu Kuai
2023-01-06 18:23             ` Tejun Heo [this message]
2023-01-09  1:38               ` Yu Kuai
2023-01-09  6:37                 ` Yu Kuai
2023-01-04  8:53 ` [PATCH -next 4/4] block/rq_qos: fail rq_qos_add() after rq_qos_exit() Yu Kuai

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y7hnH9GT6D469Vuu@slm.duckdns.org \
    --to=tj@kernel.org \
    --cc=axboe@kernel.dk \
    --cc=cgroups@vger.kernel.org \
    --cc=hch@infradead.org \
    --cc=josef@toxicpanda.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=yangerkun@huawei.com \
    --cc=yi.zhang@huawei.com \
    --cc=yukuai1@huaweicloud.com \
    --cc=yukuai3@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).