From: Dave Jones <davej@codemonkey.org.uk>
To: Chris Mason <clm@fb.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Jens Axboe <axboe@fb.com>, Andy Lutomirski <luto@amacapital.net>,
Andy Lutomirski <luto@kernel.org>,
Al Viro <viro@zeniv.linux.org.uk>, Josef Bacik <jbacik@fb.com>,
David Sterba <dsterba@suse.com>,
linux-btrfs <linux-btrfs@vger.kernel.org>,
Linux Kernel <linux-kernel@vger.kernel.org>,
Dave Chinner <david@fromorbit.com>
Subject: Re: btrfs btree_ctree_super fault
Date: Tue, 8 Nov 2016 09:59:13 -0500 [thread overview]
Message-ID: <20161108145912.fcjvwxcpqgd7kjei@codemonkey.org.uk> (raw)
In-Reply-To: <20161106165539.ybwm6rqvzh2k6uja@codemonkey.org.uk>
On Sun, Nov 06, 2016 at 11:55:39AM -0500, Dave Jones wrote:
> <subject changed, hopefully we're done with bio corruption for now>
>
> On Mon, Oct 31, 2016 at 01:44:55PM -0600, Chris Mason wrote:
> > On Mon, Oct 31, 2016 at 12:35:16PM -0700, Linus Torvalds wrote:
> > >On Mon, Oct 31, 2016 at 11:55 AM, Dave Jones <davej@codemonkey.org.uk> wrote:
> > >>
> > >> BUG: Bad page state in process kworker/u8:12 pfn:4e0e39
> > >> page:ffffea0013838e40 count:0 mapcount:0 mapping:ffff8804a20310e0 index:0x100c
> > >> flags: 0x400000000000000c(referenced|uptodate)
> > >> page dumped because: non-NULL mapping
> > >
> > >Hmm. So this seems to be btrfs-specific, right?
> > >
> > >I searched for all your "non-NULL mapping" cases, and they all seem to
> > >have basically the same call trace, with some work thread doing
> > >writeback and going through btrfs_writepages().
> > >
> > >Sounds like it's a race with either fallocate hole-punching or
> > >truncate. I'm not seeing it, but I suspect it's btrfs, since DaveJ
> > >clearly ran other filesystems too but I am not seeing this backtrace
> > >for anything else.
> >
> > Agreed, I think this is a separate bug, almost certainly btrfs specific.
> > I'll work with Dave on a better reproducer.
>
> Still refining my 'capture ftrace when trinity detects taint' feature,
> but in the meantime, here's a variant I don't think we've seen before:
And another new one:
kernel BUG at fs/btrfs/ctree.c:3172!
invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
CPU: 0 PID: 22702 Comm: trinity-c40 Not tainted 4.9.0-rc4-think+ #1
task: ffff8804ffde37c0 task.stack: ffffc90002188000
RIP: 0010:[<ffffffffa00576b9>]
[<ffffffffa00576b9>] btrfs_set_item_key_safe+0x179/0x190 [btrfs]
RSP: 0000:ffffc9000218b8a8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8804fddcf348 RCX: 0000000000001000
RDX: 0000000000000000 RSI: ffffc9000218b9ce RDI: ffffc9000218b8c7
RBP: ffffc9000218b908 R08: 0000000000004000 R09: ffffc9000218b8c8
R10: 0000000000000000 R11: 0000000000000001 R12: ffffc9000218b8b6
R13: ffffc9000218b9ce R14: 0000000000000001 R15: ffff880480684a88
FS: 00007f7c7f998b40(0000) GS:ffff880507800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000044f15f000 CR4: 00000000001406f0
DR0: 00007f4ce439d000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
Stack:
ffff880501430000 d305ffffa00a2245 006c000000000002 0500000000000010
6c000000000002d3 0000000000001000 000000006427eebb ffff880480684a88
0000000000000000 ffff8804fddcf348 0000000000002000 0000000000000000
Call Trace:
[<ffffffffa009cff0>] __btrfs_drop_extents+0xb00/0xe30 [btrfs]
[<ffffffff8116c80c>] ? function_trace_call+0x13c/0x190
[<ffffffffa009c4f5>] ? __btrfs_drop_extents+0x5/0xe30 [btrfs]
[<ffffffff810e2b00>] ? do_raw_write_lock+0xb0/0xc0
[<ffffffffa00cd43d>] btrfs_log_changed_extents+0x35d/0x630 [btrfs]
[<ffffffffa00a6a74>] ? release_extent_buffer+0xa4/0x110 [btrfs]
[<ffffffffa00cd0e5>] ? btrfs_log_changed_extents+0x5/0x630 [btrfs]
[<ffffffffa00d1085>] btrfs_log_inode+0xb05/0x11d0 [btrfs]
[<ffffffff8116536c>] ? trace_function+0x6c/0x80
[<ffffffffa00d0580>] ? log_directory_changes+0xc0/0xc0 [btrfs]
[<ffffffffa00d1a20>] ? btrfs_log_inode_parent+0x240/0x940 [btrfs]
[<ffffffff8116c80c>] ? function_trace_call+0x13c/0x190
[<ffffffffa00d1a20>] btrfs_log_inode_parent+0x240/0x940 [btrfs]
[<ffffffffa00d17e5>] ? btrfs_log_inode_parent+0x5/0x940 [btrfs]
[<ffffffff81259131>] ? dget_parent+0x71/0x150
[<ffffffffa00d3102>] btrfs_log_dentry_safe+0x62/0x80 [btrfs]
[<ffffffffa009f404>] btrfs_sync_file+0x344/0x4d0 [btrfs]
[<ffffffff81278a1b>] vfs_fsync_range+0x4b/0xb0
[<ffffffff812607c5>] ? __fget_light+0x5/0x60
[<ffffffff81278add>] do_fsync+0x3d/0x70
[<ffffffff81278aa5>] ? do_fsync+0x5/0x70
[<ffffffff81278db3>] SyS_fdatasync+0x13/0x20
[<ffffffff81002d81>] do_syscall_64+0x61/0x170
[<ffffffff81894ccb>] entry_SYSCALL64_slow_path+0x25/0x25
Code: 48 8b 45 b7 48 8d 7d bf 4c 89 ee 48 89 45 c8 0f b6 45 b6 88 45 c7 48 8b 45 ae 48 89 45 bf e8 af f2 ff ff 85 c0 0f 8f 43 ff ff ff <0f> 0b 0f 0b e8 ee f3 02 e1 0f 1f 40 00 66 2e 0f 1f 84 00 00 00
Unfortunatly, because this was a BUG_ON, it locked up the box so it
didn't save any additional debug info. Tempted to see if making BUG_ON
a no-op will at least let it live long enough to save the ftrace buffer.
Given this seems to be mutating every time I see something go wrong,
I'm wondering if this is fallout from memory corruption again.
Dave
next prev parent reply other threads:[~2016-11-08 14:59 UTC|newest]
Thread overview: 118+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-10-11 14:45 btrfs bio linked list corruption Dave Jones
2016-10-11 15:11 ` Al Viro
2016-10-11 15:19 ` Dave Jones
2016-10-11 15:20 ` Chris Mason
2016-10-11 15:49 ` Dave Jones
2016-10-11 15:54 ` Chris Mason
2016-10-11 16:25 ` Dave Jones
2016-10-12 13:47 ` Dave Jones
2016-10-12 14:40 ` Dave Jones
2016-10-12 14:42 ` Chris Mason
2016-10-13 18:16 ` Dave Jones
2016-10-13 21:18 ` Chris Mason
2016-10-13 21:56 ` Dave Jones
2016-10-16 0:42 ` Dave Jones
2016-10-18 1:07 ` Chris Mason
2016-10-18 22:42 ` Dave Jones
2016-10-18 23:12 ` Jens Axboe
2016-10-18 23:31 ` Chris Mason
2016-10-18 23:36 ` Jens Axboe
2016-10-18 23:39 ` Linus Torvalds
2016-10-18 23:42 ` Chris Mason
2016-10-19 0:10 ` Linus Torvalds
2016-10-19 0:19 ` Chris Mason
2016-10-19 0:28 ` Linus Torvalds
2016-10-20 22:48 ` Dave Jones
2016-10-19 1:05 ` Andy Lutomirski
2016-10-20 22:50 ` Dave Jones
2016-10-20 23:01 ` Andy Lutomirski
2016-10-20 23:03 ` Dave Jones
2016-10-20 23:23 ` Andy Lutomirski
2016-10-21 20:02 ` Dave Jones
2016-10-21 20:17 ` Chris Mason
2016-10-21 20:23 ` Dave Jones
2016-10-21 20:38 ` Chris Mason
2016-10-21 20:41 ` Josef Bacik
2016-10-21 21:11 ` Dave Jones
2016-10-22 15:20 ` Dave Jones
2016-10-23 21:32 ` Chris Mason
2016-10-24 4:40 ` Dave Jones
2016-10-24 13:42 ` Chris Mason
2016-10-26 0:27 ` Dave Jones
2016-10-26 1:33 ` Linus Torvalds
2016-10-26 1:39 ` Linus Torvalds
2016-10-26 16:30 ` Dave Jones
2016-10-26 16:48 ` Linus Torvalds
2016-10-26 18:18 ` Dave Jones
2016-10-26 18:42 ` Dave Jones
2016-10-26 19:06 ` Linus Torvalds
2016-10-26 20:00 ` Chris Mason
2016-10-26 21:52 ` Chris Mason
2016-10-26 22:21 ` Linus Torvalds
2016-10-26 22:40 ` Dave Jones
2016-10-26 22:51 ` Linus Torvalds
2016-10-26 22:55 ` Jens Axboe
2016-10-26 22:58 ` Linus Torvalds
2016-10-26 23:03 ` Jens Axboe
2016-10-26 23:07 ` Dave Jones
2016-10-26 23:08 ` Linus Torvalds
2016-10-26 23:20 ` Jens Axboe
2016-10-26 23:38 ` Chris Mason
2016-10-26 23:47 ` Dave Jones
2016-10-27 0:00 ` Jens Axboe
2016-10-27 13:33 ` Chris Mason
2016-10-31 18:55 ` Dave Jones
2016-10-31 19:35 ` Linus Torvalds
2016-10-31 19:44 ` Chris Mason
2016-11-06 16:55 ` btrfs btree_ctree_super fault Dave Jones
2016-11-08 14:59 ` Dave Jones [this message]
2016-11-08 15:08 ` Chris Mason
2016-11-10 14:35 ` Dave Jones
2016-11-10 15:27 ` Chris Mason
2016-11-23 19:34 ` bio linked list corruption Dave Jones
2016-11-23 19:58 ` Dave Jones
2016-12-01 15:32 ` btrfs_destroy_inode warn (outstanding extents) Dave Jones
2016-12-03 16:48 ` Dave Jones
2016-12-07 16:15 ` Dave Jones
2016-12-09 21:12 ` Steven Rostedt
2016-12-04 23:04 ` bio linked list corruption Vegard Nossum
2016-12-05 11:10 ` Vegard Nossum
2016-12-05 17:09 ` Vegard Nossum
2016-12-05 17:21 ` Dave Jones
2016-12-05 17:55 ` Linus Torvalds
2016-12-05 19:11 ` Vegard Nossum
2016-12-05 20:10 ` Linus Torvalds
2016-12-05 20:35 ` Linus Torvalds
2016-12-05 21:33 ` Vegard Nossum
2016-12-06 8:42 ` Vegard Nossum
2016-12-06 8:16 ` Peter Zijlstra
2016-12-06 8:36 ` Ingo Molnar
2016-12-06 16:33 ` Linus Torvalds
2016-12-05 20:10 ` Vegard Nossum
2016-12-05 18:11 ` Andy Lutomirski
2016-12-05 18:25 ` Linus Torvalds
2016-12-05 18:26 ` Vegard Nossum
2016-10-26 23:19 ` Chris Mason
2016-10-26 23:21 ` Jens Axboe
2016-10-27 6:33 ` Christoph Hellwig
2016-10-27 16:34 ` Linus Torvalds
2016-10-27 16:36 ` Jens Axboe
2016-10-26 23:01 ` Dave Jones
2016-10-26 23:05 ` Jens Axboe
2016-10-26 22:52 ` Jens Axboe
2016-10-26 22:07 ` Linus Torvalds
2016-10-26 22:54 ` Chris Mason
2016-10-27 5:41 ` Dave Chinner
2016-10-27 17:23 ` Dave Jones
2016-10-24 20:06 ` Andy Lutomirski
2016-10-24 20:46 ` Linus Torvalds
2016-10-24 21:17 ` Linus Torvalds
2016-10-24 21:50 ` Linus Torvalds
2016-10-24 22:02 ` Chris Mason
2016-10-24 22:42 ` Andy Lutomirski
2016-10-25 0:00 ` Linus Torvalds
2016-10-25 1:09 ` Andy Lutomirski
2016-10-19 17:09 ` Philipp Hahn
2016-10-19 17:43 ` Linus Torvalds
2016-10-20 6:52 ` Ingo Molnar
2016-10-20 7:17 ` Thomas Gleixner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20161108145912.fcjvwxcpqgd7kjei@codemonkey.org.uk \
--to=davej@codemonkey.org.uk \
--cc=axboe@fb.com \
--cc=clm@fb.com \
--cc=david@fromorbit.com \
--cc=dsterba@suse.com \
--cc=jbacik@fb.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=luto@amacapital.net \
--cc=luto@kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).