* general protection fault (btrfs_real_readdir)
@ 2016-05-18 11:31 Markus Trippelsdorf
2016-05-18 12:21 ` Al Viro
0 siblings, 1 reply; 6+ messages in thread
From: Markus Trippelsdorf @ 2016-05-18 11:31 UTC
To: Al Viro; +Cc: linux-kernel
I'm running the latest Linus git tree and the parallel filesystem directory
handling update seems to cause the following issue:
general protection fault: 0000 [#1] SMP
CPU: 0 PID: 24801 Comm: ld Not tainted 4.6.0-03623-g0b7962a6c4a3-dirty #118
Hardware name: System manufacturer System Product Name/M4A78T-E, BIOS 3503 04/13/2011
task: ffff88016cafa800 ti: ffff8801076e0000 task.ti: ffff8801076e0000
RIP: 0010:[<ffffffff8134e9f3>] [<ffffffff8134e9f3>] btrfs_readdir_delayed_dir_index+0x73/0x120
RSP: 0018:ffff8801076e3dc0 EFLAGS: 00010202
RAX: dead000000000100 RBX: ffff8800dbbd3840 RCX: ffff8800dbbd3880
RDX: dead000000000200 RSI: ffff8801076e3e30 RDI: ffff8801076e3ef0
RBP: ffff8801076e3e30 R08: ffff88006ac4b798 R09: ffff880000000000
R10: 0000160000000000 R11: 0000000000001000 R12: dead000000000200
R13: dead000000000100 R14: ffff8801076e3ef0 R15: dead0000000000c0
FS: 00007f4782e04740(0000) GS:ffff88021fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f47821ef000 CR3: 00000001b9e38000 CR4: 00000000000006f0
Stack:
ffff8801076e3e2f 0000000000003000 0000000000000001 ffff88006ac4b6f8
ffff8801076e3e30 ffff8801076e3ef0 0000000000000000 0000000000000008
ffffffff812f038b 0000001281e62a80 ffff880214ea8800 ffff8801986fdbd0
Call Trace:
[<ffffffff812f038b>] ? btrfs_real_readdir+0x44b/0x540
[<ffffffff811b064d>] ? SyS_getdents+0x12d/0x2a0
[<ffffffff811affa0>] ? SyS_ioctl+0x6a0/0x6a0
[<ffffffff810923db>] ? entry_SYSCALL_64_fastpath+0x13/0x8f
Code: 02 00 00 00 00 ad de eb 1e f0 ff 4b 60 74 73 49 8b 47 40 49 8d 57 40 4c 89 fb 48 39 d5 4c 8d 78 c0 0f 84 8d 00 00 00 48 8b 53 48 <48> 89 50 08 48 89 02 4c 89 6b 40 4c 89 63 48 48 8b 4b 21 49 3b
RIP [<ffffffff8134e9f3>] btrfs_readdir_delayed_dir_index+0x73/0x120
RSP <ffff8801076e3dc0>
---[ end trace 91067801e8a68a7e ]---
This happened while I was building gcc, so the system was very busy.
--
Markus
* Re: general protection fault (btrfs_real_readdir)
From: Al Viro @ 2016-05-18 12:21 UTC
To: Markus Trippelsdorf; +Cc: linux-kernel, Chris Mason
On Wed, May 18, 2016 at 01:31:40PM +0200, Markus Trippelsdorf wrote:
> I'm running the latest Linus git tree and the parallel filesystem directory
> handling update seems to cause the following issue:
> Call Trace:
> [<ffffffff812f038b>] ? btrfs_real_readdir+0x44b/0x540
> [<ffffffff811b064d>] ? SyS_getdents+0x12d/0x2a0
> [<ffffffff811affa0>] ? SyS_ioctl+0x6a0/0x6a0
> [<ffffffff810923db>] ? entry_SYSCALL_64_fastpath+0x13/0x8f
> Code: 02 00 00 00 00 ad de eb 1e f0 ff 4b 60 74 73 49 8b 47 40 49 8d 57 40 4c 89 fb 48 39 d5 4c 8d 78 c0 0f 84 8d 00 00 00 48 8b 53 48 <48> 89 50 08 48 89 02 4c 89 6b 40 4c 89 63 48 48 8b 4b 21 49 3b
> RIP [<ffffffff8134e9f3>] btrfs_readdir_delayed_dir_index+0x73/0x120
> RSP <ffff8801076e3dc0>
> ---[ end trace 91067801e8a68a7e ]---
>
> This happened while I was building gcc, so the system was very busy.
From a very superficial reading of delayed-inode.c, it looks like delayed
node might need locking... This
	list_for_each_entry_safe(curr, next, ins_list, readdir_list) {
		list_del(&curr->readdir_list);
looks particularly unpleasant. Just to make sure that this *is* just a
readdir issue (and not something involving lookups), could you try to
reproduce the breakage with 972b241f8 reverted?
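To illustrate why unlinking entries from a list that another task may also be
walking is dangerous, here is a minimal user-space sketch of list_del()-style
poisoning. It assumes the x86-64 poison values of kernels from this period
(0xdead000000000100 and 0xdead000000000200, which match RAX and RDX in the
oops above). The names list_add_tail_sketch(), list_del_sketch(),
entry_is_poisoned() and second_unlink_would_fault() are made up for the
example; this is not the btrfs code.

```c
#include <assert.h>
#include <stdint.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Assumed x86-64 poison values (illegal pointer base 0xdead000000000000). */
#define LIST_POISON1 ((struct list_head *)(uintptr_t)0xdead000000000100ULL)
#define LIST_POISON2 ((struct list_head *)(uintptr_t)0xdead000000000200ULL)

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail_sketch(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Same shape as the kernel's list_del(): unlink, then poison the entry. */
static void list_del_sketch(struct list_head *entry)
{
	entry->next->prev = entry->prev;	/* corresponds to list.h:89 */
	entry->prev->next = entry->next;
	entry->next = LIST_POISON1;
	entry->prev = LIST_POISON2;
}

static int entry_is_poisoned(const struct list_head *entry)
{
	return entry->next == LIST_POISON1 && entry->prev == LIST_POISON2;
}

/*
 * Simulate two walkers that both believe they own entry "a": the first
 * unlinks it, and a second unlink would then dereference poison.
 * Returns 1 if a second list_del_sketch(&a) would touch poison pointers.
 */
static int second_unlink_would_fault(void)
{
	struct list_head head, a, b;

	list_init(&head);
	list_add_tail_sketch(&a, &head);
	list_add_tail_sketch(&b, &head);

	list_del_sketch(&a);		/* first walker */
	assert(head.next == &b);	/* the list itself is still consistent */

	return entry_is_poisoned(&a);	/* a second unlink must not happen */
}
```

The sketch only checks for the poison pattern instead of performing the second
unlink, since dereferencing the poison pointers is exactly the fault shown in
the report.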
* Re: general protection fault (btrfs_real_readdir)
From: Markus Trippelsdorf @ 2016-05-18 12:32 UTC
To: Al Viro; +Cc: linux-kernel, Chris Mason
On 2016.05.18 at 13:21 +0100, Al Viro wrote:
> On Wed, May 18, 2016 at 01:31:40PM +0200, Markus Trippelsdorf wrote:
> > I'm running the latest Linus git tree and the parallel filesystem directory
> > handling update seems to cause the following issue:
>
> > Call Trace:
> > [<ffffffff812f038b>] ? btrfs_real_readdir+0x44b/0x540
> > [<ffffffff811b064d>] ? SyS_getdents+0x12d/0x2a0
> > [<ffffffff811affa0>] ? SyS_ioctl+0x6a0/0x6a0
> > [<ffffffff810923db>] ? entry_SYSCALL_64_fastpath+0x13/0x8f
> > Code: 02 00 00 00 00 ad de eb 1e f0 ff 4b 60 74 73 49 8b 47 40 49 8d 57 40 4c 89 fb 48 39 d5 4c 8d 78 c0 0f 84 8d 00 00 00 48 8b 53 48 <48> 89 50 08 48 89 02 4c 89 6b 40 4c 89 63 48 48 8b 4b 21 49 3b
> > RIP [<ffffffff8134e9f3>] btrfs_readdir_delayed_dir_index+0x73/0x120
> > RSP <ffff8801076e3dc0>
> > ---[ end trace 91067801e8a68a7e ]---
> >
> > This happened while I was building gcc, so the system was very busy.
>
> From a very superficial reading of delayed-inode.c, it looks like delayed
> node might need locking... This
> list_for_each_entry_safe(curr, next, ins_list, readdir_list) {
> list_del(&curr->readdir_list);
> looks particularly unpleasant. Just to make sure that this *is* just a
> readdir issue (and not something involving lookups), could you try to
> reproduce the breakage with 972b241f8 reverted?
Will give it a try later.
Just in case it may help:
(gdb) list *(btrfs_readdir_delayed_dir_index+0x73)
0xffffffff8134e9f3 is in btrfs_readdir_delayed_dir_index (include/linux/list.h:89).
84 * This is only for internal list manipulation where we know
85 * the prev/next entries already!
86 */
87 static inline void __list_del(struct list_head * prev, struct list_head * next)
88 {
89 next->prev = prev;
90 WRITE_ONCE(prev->next, next);
91 }
92
93 /**
(gdb) list *(btrfs_real_readdir+0x44b)
0xffffffff812f038b is in btrfs_real_readdir (fs/btrfs/inode.c:5858).
5853
5854 if (key_type == BTRFS_DIR_INDEX_KEY) {
5855 if (is_curr)
5856 ctx->pos++;
5857 ret = btrfs_readdir_delayed_dir_index(ctx, &ins_list, &emitted);
5858 if (ret)
5859 goto nopos;
5860 }
5861
5862 /*
--
Markus
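The gdb listing above points at the store `next->prev = prev` in __list_del().
The marked opcode bytes in the oops (`<48> 89 50 08`) appear to decode to
`mov %rdx,0x8(%rax)`, so with RAX = dead000000000100 the store targets
dead000000000108. The following sketch, assuming 48-bit virtual addresses on
x86-64, checks whether an address is canonical; a non-canonical access is
reported as a general protection fault rather than a page fault. The helper
names is_canonical48() and store_target() are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * With 48-bit virtual addresses, bits 63..47 must be all zeros or all
 * ones for the address to be canonical.
 */
static bool is_canonical48(uint64_t addr)
{
	uint64_t top = addr >> 47;	/* the upper 17 bits */

	return top == 0 || top == 0x1ffff;
}

/* Address written by "mov %rdx,0x8(%rax)": the prev field of *next. */
static uint64_t store_target(uint64_t rax)
{
	return rax + 8;
}
```

For example, `is_canonical48(store_target(0xdead000000000100ULL))` is false,
while the stack pointer from the report, 0xffff8801076e3dc0, is canonical.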
* Re: general protection fault (btrfs_real_readdir)
From: Markus Trippelsdorf @ 2016-05-18 13:01 UTC
To: Al Viro; +Cc: linux-kernel, Chris Mason
On 2016.05.18 at 13:21 +0100, Al Viro wrote:
> On Wed, May 18, 2016 at 01:31:40PM +0200, Markus Trippelsdorf wrote:
> > I'm running the latest Linus git tree and the parallel filesystem directory
> > handling update seems to cause the following issue:
>
> > Call Trace:
> > [<ffffffff812f038b>] ? btrfs_real_readdir+0x44b/0x540
> > [<ffffffff811b064d>] ? SyS_getdents+0x12d/0x2a0
> > [<ffffffff811affa0>] ? SyS_ioctl+0x6a0/0x6a0
> > [<ffffffff810923db>] ? entry_SYSCALL_64_fastpath+0x13/0x8f
> > Code: 02 00 00 00 00 ad de eb 1e f0 ff 4b 60 74 73 49 8b 47 40 49 8d 57 40 4c 89 fb 48 39 d5 4c 8d 78 c0 0f 84 8d 00 00 00 48 8b 53 48 <48> 89 50 08 48 89 02 4c 89 6b 40 4c 89 63 48 48 8b 4b 21 49 3b
> > RIP [<ffffffff8134e9f3>] btrfs_readdir_delayed_dir_index+0x73/0x120
> > RSP <ffff8801076e3dc0>
> > ---[ end trace 91067801e8a68a7e ]---
> >
> > This happened while I was building gcc, so the system was very busy.
>
> From a very superficial reading of delayed-inode.c, it looks like delayed
> node might need locking... This
> list_for_each_entry_safe(curr, next, ins_list, readdir_list) {
> list_del(&curr->readdir_list);
> looks particularly unpleasant. Just to make sure that this *is* just a
> readdir issue (and not something involving lookups), could you try to
> reproduce the breakage with 972b241f8 reverted?
For what it's worth, gcc bootstrapped fine with 972b241f8 reverted.
--
Markus
* Re: general protection fault (btrfs_real_readdir)
From: Chris Mason @ 2016-05-18 13:11 UTC
To: Al Viro; +Cc: Markus Trippelsdorf, linux-kernel
On Wed, May 18, 2016 at 01:21:14PM +0100, Al Viro wrote:
> On Wed, May 18, 2016 at 01:31:40PM +0200, Markus Trippelsdorf wrote:
> > I'm running the latest Linus git tree and the parallel filesystem directory
> > handling update seems to cause the following issue:
>
> > Call Trace:
> > [<ffffffff812f038b>] ? btrfs_real_readdir+0x44b/0x540
> > [<ffffffff811b064d>] ? SyS_getdents+0x12d/0x2a0
> > [<ffffffff811affa0>] ? SyS_ioctl+0x6a0/0x6a0
> > [<ffffffff810923db>] ? entry_SYSCALL_64_fastpath+0x13/0x8f
> > Code: 02 00 00 00 00 ad de eb 1e f0 ff 4b 60 74 73 49 8b 47 40 49 8d 57 40 4c 89 fb 48 39 d5 4c 8d 78 c0 0f 84 8d 00 00 00 48 8b 53 48 <48> 89 50 08 48 89 02 4c 89 6b 40 4c 89 63 48 48 8b 4b 21 49 3b
> > RIP [<ffffffff8134e9f3>] btrfs_readdir_delayed_dir_index+0x73/0x120
> > RSP <ffff8801076e3dc0>
> > ---[ end trace 91067801e8a68a7e ]---
> >
> > This happened while I was building gcc, so the system was very busy.
>
> From a very superficial reading of delayed-inode.c, it looks like delayed
> node might need locking... This
> list_for_each_entry_safe(curr, next, ins_list, readdir_list) {
> list_del(&curr->readdir_list);
> looks particularly unpleasant. Just to make sure that this *is* just a
> readdir issue (and not something involving lookups), could you try to
> reproduce the breakage with 972b241f8 reverted?
Yeah, it does expect the mutex to be held. Sorry Al, I missed this when
you asked. I'll cook a patch today.
-chris
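As a rough illustration of what "expects the mutex to be held" means for a
walk that unlinks entries, here is a user-space sketch using a pthread mutex.
It is not the btrfs code and not the patch mentioned above; struct
pending_item, struct node_sketch, node_add() and node_drain() are hypothetical
names chosen for the example.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the btrfs structures. */
struct pending_item {
	struct pending_item *next;
	int index;
};

struct node_sketch {
	pthread_mutex_t lock;
	struct pending_item *head;
};

/* Insert at the head of the pending list, under the lock. */
static void node_add(struct node_sketch *node, struct pending_item *item)
{
	pthread_mutex_lock(&node->lock);
	item->next = node->head;
	node->head = item;
	pthread_mutex_unlock(&node->lock);
}

/*
 * Walk the list and unlink every item with the lock held throughout, so
 * two concurrent callers can never unlink the same item twice.
 * Returns the number of items unlinked by this call.
 */
static int node_drain(struct node_sketch *node)
{
	int n = 0;

	pthread_mutex_lock(&node->lock);
	while (node->head) {
		struct pending_item *item = node->head;

		node->head = item->next;
		item->next = NULL;
		n++;
	}
	pthread_mutex_unlock(&node->lock);
	return n;
}
```

With the lock taken around the whole walk, a second caller either sees the
full list or an empty one, never a half-unlinked entry.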
* Re: general protection fault (btrfs_real_readdir)
From: Chris Mason @ 2016-05-18 14:14 UTC
To: Al Viro; +Cc: Markus Trippelsdorf, linux-kernel
On Wed, May 18, 2016 at 01:21:14PM +0100, Al Viro wrote:
> On Wed, May 18, 2016 at 01:31:40PM +0200, Markus Trippelsdorf wrote:
> > I'm running the latest Linus git tree and the parallel filesystem directory
> > handling update seems to cause the following issue:
>
> > Call Trace:
> > [<ffffffff812f038b>] ? btrfs_real_readdir+0x44b/0x540
> > [<ffffffff811b064d>] ? SyS_getdents+0x12d/0x2a0
> > [<ffffffff811affa0>] ? SyS_ioctl+0x6a0/0x6a0
> > [<ffffffff810923db>] ? entry_SYSCALL_64_fastpath+0x13/0x8f
> > Code: 02 00 00 00 00 ad de eb 1e f0 ff 4b 60 74 73 49 8b 47 40 49 8d 57 40 4c 89 fb 48 39 d5 4c 8d 78 c0 0f 84 8d 00 00 00 48 8b 53 48 <48> 89 50 08 48 89 02 4c 89 6b 40 4c 89 63 48 48 8b 4b 21 49 3b
> > RIP [<ffffffff8134e9f3>] btrfs_readdir_delayed_dir_index+0x73/0x120
> > RSP <ffff8801076e3dc0>
> > ---[ end trace 91067801e8a68a7e ]---
> >
> > This happened while I was building gcc, so the system was very busy.
>
> From a very superficial reading of delayed-inode.c, it looks like delayed
> node might need locking... This
> list_for_each_entry_safe(curr, next, ins_list, readdir_list) {
> list_del(&curr->readdir_list);
> looks particularly unpleasant. Just to make sure that this *is* just a
> readdir issue (and not something involving lookups), could you try to
> reproduce the breakage with 972b241f8 reverted?
Ok, really fixing this means redoing the delayed directory walking
completely. I'm not comfortable with that for a one-day fix, so let's just
revert 972b241f8 while I bash this out.
-chris