From: Coly Li <colyli@suse.de>
To: Mingzhe Zou <mingzhe.zou@easystack.cn>
Cc: Eric Wheeler <bcache@lists.ewheeler.net>,
Bcache Linux <linux-bcache@vger.kernel.org>,
zoumingzhe@qq.com
Subject: Re: [PATCH] bcache: fixup multi-threaded bch_sectors_dirty_init() wake-up race
Date: Fri, 20 Oct 2023 17:23:25 +0800 [thread overview]
Message-ID: <44D56265-0F0C-4FAF-A9A0-7BA97D41C84F@suse.de> (raw)
In-Reply-To: <20231020075051.261222-1-mingzhe.zou@easystack.cn>
> 2023年10月20日 15:50,Mingzhe Zou <mingzhe.zou@easystack.cn> 写道:
>
> We get a kernel crash about "unable to handle kernel paging request":
>
> ```dmesg
> [368033.032005] BUG: unable to handle kernel paging request at ffffffffad9ae4b5
> [368033.032007] PGD fc3a0d067 P4D fc3a0d067 PUD fc3a0e063 PMD 8000000fc38000e1
> [368033.032012] Oops: 0003 [#1] SMP PTI
> [368033.032015] CPU: 23 PID: 55090 Comm: bch_dirtcnt[0] Kdump: loaded Tainted: G OE --------- - - 4.18.0-147.5.1.es8_24.x86_64 #1
> [368033.032017] Hardware name: Tsinghua Tongfang THTF Chaoqiang Server/072T6D, BIOS 2.4.3 01/17/2017
> [368033.032027] RIP: 0010:native_queued_spin_lock_slowpath+0x183/0x1d0
> [368033.032029] Code: 8b 02 48 85 c0 74 f6 48 89 c1 eb d0 c1 e9 12 83 e0
> 03 83 e9 01 48 c1 e0 05 48 63 c9 48 05 c0 3d 02 00 48 03 04 cd 60 68 93
> ad <48> 89 10 8b 42 08 85 c0 75 09 f3 90 8b 42 08 85 c0 74 f7 48 8b 02
> [368033.032031] RSP: 0018:ffffbb48852abe00 EFLAGS: 00010082
> [368033.032032] RAX: ffffffffad9ae4b5 RBX: 0000000000000246 RCX: 0000000000003bf3
> [368033.032033] RDX: ffff97b0ff8e3dc0 RSI: 0000000000600000 RDI: ffffbb4884743c68
> [368033.032034] RBP: 0000000000000001 R08: 0000000000000000 R09: 000007ffffffffff
> [368033.032035] R10: ffffbb486bb01000 R11: 0000000000000001 R12: ffffffffc068da70
> [368033.032036] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000000
> [368033.032038] FS: 0000000000000000(0000) GS:ffff97b0ff8c0000(0000) knlGS:0000000000000000
> [368033.032039] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [368033.032040] CR2: ffffffffad9ae4b5 CR3: 0000000fc3a0a002 CR4: 00000000003626e0
> [368033.032042] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [368033.032043] bcache: bch_cached_dev_attach() Caching rbd479 as bcache462 on set 8cff3c36-4a76-4242-afaa-7630206bc70b
> [368033.032045] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [368033.032046] Call Trace:
> [368033.032054] _raw_spin_lock_irqsave+0x32/0x40
> [368033.032061] __wake_up_common_lock+0x63/0xc0
> [368033.032073] ? bch_ptr_invalid+0x10/0x10 [bcache]
> [368033.033502] bch_dirty_init_thread+0x14c/0x160 [bcache]
> [368033.033511] ? read_dirty_submit+0x60/0x60 [bcache]
> [368033.033516] kthread+0x112/0x130
> [368033.033520] ? kthread_flush_work_fn+0x10/0x10
> [368033.034505] ret_from_fork+0x35/0x40
> ```
>
> The crash occurred when call wake_up(&state->wait), and then we want
> to look at the value in the state. However, bch_sectors_dirty_init()
> is not found in the stack of any task. Since state is allocated on
> the stack, we guess that bch_sectors_dirty_init() has exited, causing
> bch_dirty_init_thread() to be unable to handle kernel paging request.
>
> In order to verify this idea, we added some printing information during
> wake_up(&state->wait). We find that "wake up" is printed twice, however
> we only expect the last thread to wake up once.
>
> ```dmesg
> [ 994.641004] alcache: bch_dirty_init_thread() wake up
> [ 994.641018] alcache: bch_dirty_init_thread() wake up
> [ 994.641523] alcache: bch_sectors_dirty_init() init exit
> ```
>
> There is a race. If bch_sectors_dirty_init() exits after the first wake
> up, the second wake up will trigger this bug("unable to handle kernel
> paging request").
>
> Proceed as follows:
>
> bch_sectors_dirty_init
> kthread_run ==============> bch_dirty_init_thread(bch_dirtcnt[0])
> ... ...
> atomic_inc(&state.started) ...
> ... ...
> atomic_read(&state.enough) ...
> ... atomic_set(&state->enough, 1)
> kthread_run ======================================================> bch_dirty_init_thread(bch_dirtcnt[1])
> ... atomic_dec_and_test(&state->started) ...
> atomic_inc(&state.started) ... ...
> ... wake_up(&state->wait) ...
> atomic_read(&state.enough) atomic_dec_and_test(&state->started)
> ... ...
> wait_event(state.wait, atomic_read(&state.started) == 0) ...
> return ...
> wake_up(&state->wait)
>
> We believe it is very common to wake up twice if there is no dirty, but
> crash is an extremely low probability event. It's hard for us to reproduce
> this issue. We attached and detached continuously for a week, with a total
> of more than one million attaches and only one crash.
>
> Putting atomic_inc(&state.started) before kthread_run() can avoid waking
> up twice.
>
> Fixes: b144e45fc576 ("bcache: make bch_sectors_dirty_init() to be multithreaded")
> Signed-off-by: Mingzhe Zou <mingzhe.zou@easystack.cn>
> Cc: stable@vger.kernel.org
Thanks for catching this. Added to my for-next.
Coly Li
> ---
> drivers/md/bcache/writeback.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
> index 24c049067f61..a6ddd0bb9220 100644
> --- a/drivers/md/bcache/writeback.c
> +++ b/drivers/md/bcache/writeback.c
> @@ -1014,17 +1014,18 @@ void bch_sectors_dirty_init(struct bcache_device *d)
> if (atomic_read(&state.enough))
> break;
>
> + atomic_inc(&state.started);
> state.infos[i].state = &state;
> state.infos[i].thread =
> kthread_run(bch_dirty_init_thread, &state.infos[i],
> "bch_dirtcnt[%d]", i);
> if (IS_ERR(state.infos[i].thread)) {
> pr_err("fails to run thread bch_dirty_init[%d]\n", i);
> + atomic_dec(&state.started);
> for (--i; i >= 0; i--)
> kthread_stop(state.infos[i].thread);
> goto out;
> }
> - atomic_inc(&state.started);
> }
>
> out:
> --
> 2.17.1.windows.2
>
prev parent reply other threads:[~2023-10-20 9:23 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-20 7:50 [PATCH] bcache: fixup multi-threaded bch_sectors_dirty_init() wake-up race Mingzhe Zou
2023-10-20 9:23 ` Coly Li [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=44D56265-0F0C-4FAF-A9A0-7BA97D41C84F@suse.de \
--to=colyli@suse.de \
--cc=bcache@lists.ewheeler.net \
--cc=linux-bcache@vger.kernel.org \
--cc=mingzhe.zou@easystack.cn \
--cc=zoumingzhe@qq.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).