From: Coly Li <colyli@suse.de>
To: Yi Li <yili@winhong.com>
Cc: yilikernel@gmail.com, kent.overstreet@gmail.com,
linux-bcache@vger.kernel.org, linux-kernel@vger.kernel.org,
Guo Chao <guochao@winhong.com>
Subject: Re: [PATCH v2] bcache: fix panic due to cache_set is null
Date: Thu, 3 Dec 2020 19:27:45 +0800 [thread overview]
Message-ID: <d25c094a-fcbc-c4da-3bf5-72549578e653@suse.de> (raw)
In-Reply-To: <20201203094711.3236551-1-yili@winhong.com>
On 12/3/20 5:47 PM, Yi Li wrote:
> bcache_device_detach will release the cache_set after hotunplug cache
> disk.
>
> Here is how the issue happens.
> 1) cached_dev_free do cancel_writeback_rate_update_dwork
> without bch_register_lock.
> 2) Wirting the writeback_percent by sysfs with
> bch_register_lock will insert a writeback_rate_update work.
> 3) cached_dev_free with bch_register_lock to do bcache_device_free.
> dc->disk.cl will be set NULL
> 4) update_writeback_rate will crash when access dc->disk.cl
The analysis makes sense, good catch! Thank you for make me understand
the problem.
>
> Fixes: 80265d8dfd77 ("bcache: acquire bch_register_lock later in cached_dev_free()")
>
> IP: [<ffffffffa03730c9>] update_writeback_rate+0x59/0x3a0 [bcache]
> PGD 879620067 PUD 8755d3067 PMD 0
> Oops: 0000 [#1] SMP
> CPU: 8 PID: 1005702 Comm: kworker/8:0 Tainted: G 4.4.0+10 #1
> Hardware name: Intel BIOS SE5C610.86B.01.01.0021.032120170601 03/21/2017
> Workqueue: events update_writeback_rate [bcache]
> task: ffff8808786f3800 ti: ffff88077082c000 task.ti: ffff88077082c000
> RIP: e030:[<ffffffffa03730c9>] update_writeback_rate+0x59/0x3a0 [bcache]
> RSP: e02b:ffff88077082fde0 EFLAGS: 00010202
> RAX: 0000000000000018 RBX: ffff8808047f0b08 RCX: 0000000000000000
> RDX: 0000000000000001 RSI: ffff88088170dab8 RDI: ffff88088170dab8
> RBP: ffff88077082fe18 R08: 000000000000000a R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000017bc8 R12: 0000000000000000
> R13: ffff8808047f0000 R14: 0000000000000200 R15: ffff8808047f0b08
> FS: 00007f157b6d6700(0000) GS:ffff880881700000(0000) knlGS:0000000000000000
> CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000368 CR3: 0000000875c05000 CR4: 0000000000040660
> Stack:
> 0000000000000001 0000000000007ff0 ffff88085ff600c0 ffff880881714e80
> ffff880881719500 0000000000000200 ffff8808047f0b08 ffff88077082fe60
> ffffffff81088c0c 0000000081714e80 0000000000000000 ffff880881714e80
> Call Trace:
> [<ffffffff81088c0c>] process_one_work+0x1fc/0x3b0
> [<ffffffff81089575>] worker_thread+0x2a5/0x470
> [<ffffffff815a2f58>] ? __schedule+0x648/0x870
> [<ffffffff810892d0>] ? rescuer_thread+0x300/0x300
> [<ffffffff8108e3d5>] kthread+0xd5/0xe0
> [<ffffffff8108e300>] ? kthread_stop+0x110/0x110
> [<ffffffff815a704f>] ret_from_fork+0x3f/0x70
> [<ffffffff8108e300>] ? kthread_stop+0x110/0x110
>
> Reported-by: Guo Chao <guochao@winhong.com>
> Signed-off-by: Yi Li <yili@winhong.com>
> ---
> drivers/md/bcache/super.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
> index 46a00134a36a..8b341f756ac0 100644
> --- a/drivers/md/bcache/super.c
> +++ b/drivers/md/bcache/super.c
> @@ -1334,9 +1334,6 @@ static void cached_dev_free(struct closure *cl)
> {
> struct cached_dev *dc = container_of(cl, struct cached_dev, disk.cl);
>
> - if (test_and_clear_bit(BCACHE_DEV_WB_RUNNING, &dc->disk.flags))
> - cancel_writeback_rate_update_dwork(dc);
> -
> if (!IS_ERR_OR_NULL(dc->writeback_thread))
> kthread_stop(dc->writeback_thread);
> if (!IS_ERR_OR_NULL(dc->status_update_thread))
> @@ -1344,6 +1341,9 @@ static void cached_dev_free(struct closure *cl)
>
> mutex_lock(&bch_register_lock);
>
> + if (test_and_clear_bit(BCACHE_DEV_WB_RUNNING, &dc->disk.flags))
> + cancel_writeback_rate_update_dwork(dc);
> +
> if (atomic_read(&dc->running))
> bd_unlink_disk_holder(dc->bdev, dc->disk.disk);
> bcache_device_free(&dc->disk);
>
Such change is problematic, the writeback rate kworker mush stopped
before writeback and status_update thread, otherwise you may encounter
other problem.
And when I review your patch I find another similar potential problem.
This is tricky, let me think how to fix it ....
Thank you again, for catch such issue.
Coly Li
next prev parent reply other threads:[~2020-12-03 11:28 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-11-30 11:21 [PATCH] bcache: fix panic due to cache_set is null Yi Li
2020-12-01 4:35 ` Yi Li
2020-12-01 16:42 ` Coly Li
2020-12-03 6:25 ` Yi Li
2020-12-03 6:56 ` Coly Li
2020-12-03 9:06 ` Yi Li
2020-12-03 9:47 ` [PATCH v2] " Yi Li
2020-12-03 11:27 ` Coly Li [this message]
2020-12-04 1:33 ` Yi Li
2020-12-04 5:54 ` [PATCH v3] " Yi Li
2020-12-01 16:38 ` [PATCH] " Coly Li
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=d25c094a-fcbc-c4da-3bf5-72549578e653@suse.de \
--to=colyli@suse.de \
--cc=guochao@winhong.com \
--cc=kent.overstreet@gmail.com \
--cc=linux-bcache@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=yili@winhong.com \
--cc=yilikernel@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).