From: Gao Xiang <hsiangkao@aol.com>
To: Dave Chinner <david@fromorbit.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>,
xfs <linux-xfs@vger.kernel.org>,
tj@kernel.org, linux-fsdevel@vger.kernel.org,
linux-block@vger.kernel.org
Subject: Re: [5.4-rc1, regression] wb_workfn wakeup oops (was Re: frequent 5.4-rc1 crash?)
Date: Thu, 3 Oct 2019 16:41:56 +0800 [thread overview]
Message-ID: <20191003084149.GA16347@hsiangkao-HP-ZHAN-66-Pro-G1> (raw)
In-Reply-To: <20191003064022.GX16973@dread.disaster.area>
Hi,
On Thu, Oct 03, 2019 at 04:40:22PM +1000, Dave Chinner wrote:
> [cc linux-fsdevel, linux-block, tejun ]
>
> On Wed, Oct 02, 2019 at 06:52:47PM -0700, Darrick J. Wong wrote:
> > Hi everyone,
> >
> > Does anyone /else/ see this crash in generic/299 on a V4 filesystem (tho
> > afaict V5 configs crash too) and a 5.4-rc1 kernel? It seems to pop up
> > on generic/299 though only 80% of the time.
> >
> > --D
> >
> > [ 1806.186197] run fstests generic/299 at 2019-10-02 18:15:30
> > [ 1808.279874] XFS (sdb): Mounting V4 Filesystem
> > [ 1808.283519] XFS (sdb): Ending clean mount
> > [ 1808.284530] XFS (sdb): Quotacheck needed: Please wait.
> > [ 1808.317062] XFS (sdb): Quotacheck: Done.
> > [ 1808.319821] Mounted xfs file system at /opt supports timestamps until 2038 (0x7fffffff)
> > [ 1886.218794] BUG: kernel NULL pointer dereference, address: 0000000000000018
> > [ 1886.219787] #PF: supervisor read access in kernel mode
> > [ 1886.220638] #PF: error_code(0x0000) - not-present page
> > [ 1886.221496] PGD 0 P4D 0
> > [ 1886.221970] Oops: 0000 [#1] PREEMPT SMP
> > [ 1886.222596] CPU: 2 PID: 227320 Comm: kworker/u10:2 Tainted: G W 5.4.0-rc1-djw #rc1
> > [ 1886.224016] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1ubuntu1 04/01/2014
> > [ 1886.225261] Workqueue: writeback wb_workfn (flush-8:16)
> > [ 1886.225926] RIP: 0010:__lock_acquire+0x4c3/0x1490
> > [ 1886.226595] Code: 00 00 00 48 8b 74 24 48 65 48 33 34 25 28 00 00 00 8b 44 24 04 0f 85 a1 0f 00 00 48 83 c4 50 5b 5d 41 5c 41 5d 41 5e 41 5f c3 <48> 81 3f 20 66 6d 82 41 ba 00 00 00 00 45 0f 45 d0 83 fe 01 0f 87
> > [ 1886.229146] RSP: 0000:ffffc900052c3bc0 EFLAGS: 00010002
> > [ 1886.230008] RAX: 0000000000000000 RBX: 0000000000000018 RCX: 0000000000000000
> > [ 1886.231238] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000018
> > [ 1886.236382] RBP: ffff888077f80000 R08: 0000000000000001 R09: 0000000000000001
> > [ 1886.241630] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
> > [ 1886.243530] R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000018
> > [ 1886.244669] FS: 0000000000000000(0000) GS:ffff88807e000000(0000) knlGS:0000000000000000
> > [ 1886.245941] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 1886.246913] CR2: 0000000000000018 CR3: 0000000072f7b003 CR4: 00000000001606a0
> > [ 1886.247834] Call Trace:
> > [ 1886.248217] ? mark_held_locks+0x47/0x70
> > [ 1886.248810] ? trace_hardirqs_on_thunk+0x1a/0x20
> > [ 1886.249445] lock_acquire+0x90/0x180
> > [ 1886.249876] ? __wake_up_common_lock+0x62/0xc0
> > [ 1886.250577] _raw_spin_lock_irqsave+0x3e/0x80
> > [ 1886.251327] ? __wake_up_common_lock+0x62/0xc0
> > [ 1886.252538] __wake_up_common_lock+0x62/0xc0
> > [ 1886.257318] wb_workfn+0x10e/0x610
> > [ 1886.260171] ? __lock_acquire+0x268/0x1490
> > [ 1886.266124] ? process_one_work+0x1da/0x5d0
> > [ 1886.266941] process_one_work+0x25b/0x5d0
> > [ 1886.267759] worker_thread+0x3d/0x3a0
> > [ 1886.268497] ? process_one_work+0x5d0/0x5d0
> > [ 1886.269285] kthread+0x121/0x140
> > [ 1886.269808] ? kthread_park+0x80/0x80
> > [ 1886.270317] ret_from_fork+0x3a/0x50
> > [ 1886.270811] Modules linked in: xfs libcrc32c dm_flakey ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 ip_set_hash_ip ip_set_hash_net xt_tcpudp bfq xt_set ip_set_hash_mac ip_set nfnetlink ip6table_filter ip6_tables iptable_filter sch_fq_codel ip_tables x_tables nfsv4 af_packet [last unloaded: scsi_debug]
> > [ 1886.274144] Dumping ftrace buffer:
> > [ 1886.274637] (ftrace buffer empty)
> > [ 1886.275129] CR2: 0000000000000018
> > [ 1886.275567] ---[ end trace 20db199015efe614 ]---
> > [ 1886.278601] RIP: 0010:__lock_acquire+0x4c3/0x1490
> > [ 1886.283408] Code: 00 00 00 48 8b 74 24 48 65 48 33 34 25 28 00 00 00 8b 44 24 04 0f 85 a1 0f 00 00 48 83 c4 50 5b 5d 41 5c 41 5d 41 5e 41 5f c3 <48> 81 3f 20 66 6d 82 41 ba 00 00 00 00 45 0f 45 d0 83 fe 01 0f 87
> >
Just a quick glance, I guess there could is a race between (complete guess):
160 static void finish_writeback_work(struct bdi_writeback *wb,
161 struct wb_writeback_work *work)
162 {
163 struct wb_completion *done = work->done;
164
165 if (work->auto_free)
166 kfree(work);
167 if (done && atomic_dec_and_test(&done->cnt))
^^^ here
168 wake_up_all(done->waitq);
169 }
since new wake_up_all(done->waitq); is completely on-stack,
if (done && atomic_dec_and_test(&done->cnt))
- wake_up_all(&wb->bdi->wb_waitq);
+ wake_up_all(done->waitq);
}
which could cause use after free if on-stack wb_completion is gone...
(however previous wb->bdi is solid since it is not on-stack)
see generic on-stack completion which takes a wait_queue spin_lock between
test and wake_up...
If I am wrong, ignore me, hmm...
Thanks,
Gao Xiang
>
> generic/270 w/ xfs-dax on pmem triggers it within 3 runs every time
> here. Bisect points to this commit:
>
> $ git bisect bad
> 5b9cce4c7eb0696558dfd4946074ae1fb9d8f05d is the first bad commit
> commit 5b9cce4c7eb0696558dfd4946074ae1fb9d8f05d
> Author: Tejun Heo <tj@kernel.org>
> Date: Mon Aug 26 09:06:52 2019 -0700
>
> writeback: Generalize and expose wb_completion
>
> wb_completion is used to track writeback completions. We want to use
> it from memcg side for foreign inode flushes. This patch updates it
> to remember the target waitq instead of assuming bdi->wb_waitq and
> expose it outside of fs-writeback.c.
>
> Reviewed-by: Jan Kara <jack@suse.cz>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>
> fs/fs-writeback.c | 47 ++++++++++++----------------------------
> include/linux/backing-dev-defs.h | 20 +++++++++++++++++
> include/linux/backing-dev.h | 2 ++
> 3 files changed, 36 insertions(+), 33 deletions(-)
>
>
> $ git bisect log
> git bisect start
> # bad: [3a8e9ac89e6a5106cfb6b85d4c9cf9bfa3519bc7] writeback: add tracepoints for cgroup foreign writebacks
> git bisect bad 3a8e9ac89e6a5106cfb6b85d4c9cf9bfa3519bc7
> # good: [b8e24a9300b0836a9d39f6b20746766b3b81f1bd] block: annotate refault stalls from IO submission
> git bisect good b8e24a9300b0836a9d39f6b20746766b3b81f1bd
> # bad: [ed288dc0d4aa29f65bd25b31b5cb866aa5664ff9] writeback: Separate out wb_get_lookup() from wb_get_create()
> git bisect bad ed288dc0d4aa29f65bd25b31b5cb866aa5664ff9
> # good: [320ea869a12cec206756207c6ca5f817ec45c7f2] block: improve the gap check in __bio_add_pc_page
> git bisect good 320ea869a12cec206756207c6ca5f817ec45c7f2
> # good: [7ea88e229e9df18ecd624b0d39f3dba87432ba33] null_blk: create a helper for mem-backed ops
> git bisect good 7ea88e229e9df18ecd624b0d39f3dba87432ba33
> # good: [38b4e09fbccab6457536563823222921c49601bb] null_blk: fix inline misuse
> git bisect good 38b4e09fbccab6457536563823222921c49601bb
> # bad: [34f8fe501f0624de115d087680c84000b5d9abc9] bdi: Add bdi->id
> git bisect bad 34f8fe501f0624de115d087680c84000b5d9abc9
> # bad: [5b9cce4c7eb0696558dfd4946074ae1fb9d8f05d] writeback: Generalize and expose wb_completion
> git bisect bad 5b9cce4c7eb0696558dfd4946074ae1fb9d8f05d
> # first bad commit: [5b9cce4c7eb0696558dfd4946074ae1fb9d8f05d] writeback: Generalize and expose wb_completio
> $
>
> Not obvious to me what is wrong with that commit right now, but the
> bisect is solid. Kinda surprised to see such significant
> fs-writeback changes in 5.4, though, because there was nothing sent
> to the -fsdevel list for review in the last dev cycle.
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
next prev parent reply other threads:[~2019-10-03 8:42 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20191003015247.GI13108@magnolia>
2019-10-03 6:40 ` [5.4-rc1, regression] wb_workfn wakeup oops (was Re: frequent 5.4-rc1 crash?) Dave Chinner
2019-10-03 8:41 ` Gao Xiang [this message]
2019-10-03 14:01 ` Chris Mason
2019-10-03 14:05 ` Jens Axboe
2019-10-03 18:37 ` Darrick J. Wong
2019-10-06 22:30 ` Darrick J. Wong
2019-10-08 3:20 ` Jens Axboe
2019-10-03 14:40 ` Gao Xiang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20191003084149.GA16347@hsiangkao-HP-ZHAN-66-Pro-G1 \
--to=hsiangkao@aol.com \
--cc=darrick.wong@oracle.com \
--cc=david@fromorbit.com \
--cc=linux-block@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).