All of lore.kernel.org
 help / color / mirror / Atom feed
* kernel BUG in fs/dcache.c running nfs
@ 2014-09-08 14:45 Christoph Hellwig
  2014-09-09 14:59 ` Jeff Layton
  0 siblings, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2014-09-08 14:45 UTC (permalink / raw)
  To: linux-nfs

I've reproduced a BUG in dcache.c twice now when running xfstests, once
without the "nfs4: queue free_lock_state job submission to nfsiod" and
once with that revert in.

Both runs are using v4.1 against a Linux server with XFS on the same
host.

generic/089 199s ...
[ 5497.402293] ------------[ cut here ]------------
[ 5497.403150] kernel BUG at ../fs/dcache.c:1620!
[ 5497.403974] invalid opcode: 0000 [#1] SMP 
[ 5497.404826] Modules linked in:
[ 5497.405280] CPU: 1 PID: 14691 Comm: t_mtab Not tainted 3.17.0-rc3+ #264
[ 5497.405585] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[ 5497.405585] task: ffff88007ac801d0 ti: ffff8800670a4000 task.ti: ffff8800670a4000
[ 5497.405585] RIP: 0010:[<ffffffff811d1345>]  [<ffffffff811d1345>] d_instantiate+0x75/0x80
[ 5497.405585] RSP: 0018:ffff8800670a7a68  EFLAGS: 00010286
[ 5497.405585] RAX: 0000000000000001 RBX: ffff880066c399d8 RCX: ffff88007ac80990
[ 5497.405585] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880066c399d8
[ 5497.405585] RBP: ffff8800670a7a88 R08: 0000000000000001 R09: 0000000000000000
[ 5497.405585] R10: ffff880072f45000 R11: 000000000003fdf0 R12: ffff880066c399d8
[ 5497.405585] R13: ffff88007a684800 R14: ffff88007acbc280 R15: ffff8800670de000
[ 5497.405585] FS:  00007f6db6aae700(0000) GS:ffff88007fc80000(0000) knlGS:0000000000000000
[ 5497.405585] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 5497.405585] CR2: 00007f6db5f56800 CR3: 000000007ac9e000 CR4: 00000000000006e0
[ 5497.405585] Stack:
[ 5497.405585]  ffff8800670a7a88 ffff880066c399d8 ffff88007a684800 ffff88007a684800
[ 5497.405585]  ffff8800670a7b68 ffffffff8135d0c1 ffffffff00000004 ffff8800000000d0
[ 5497.405585]  ffff88007d400180 0000000000000246 ffff8800fffffffe ffff880072f45000
[ 5497.405585] Call Trace:
[ 5497.405585]  [<ffffffff8135d0c1>] nfs4_do_open.constprop.84+0x681/0x960
[ 5497.405585]  [<ffffffff8135d459>] nfs4_atomic_open+0x9/0x20
[ 5497.405585]  [<ffffffff8136cc3d>] nfs4_file_open+0xcd/0x1b0
[ 5497.405585]  [<ffffffff81d9ee56>] ? _raw_spin_unlock+0x26/0x30
[ 5497.405585]  [<ffffffff8176b73d>] ? lockref_get+0x1d/0x30
[ 5497.405585]  [<ffffffff8136cb70>] ? nfs4_file_fsync+0xb0/0xb0
[ 5497.405585]  [<ffffffff811b8092>] do_dentry_open.isra.13+0x1f2/0x320
[ 5497.405585]  [<ffffffff81336f52>] ? nfs_permission+0x62/0x1d0
[ 5497.405585]  [<ffffffff811b82ad>] finish_open+0x1d/0x30
[ 5497.405585]  [<ffffffff811c91be>] do_last.isra.63+0x62e/0xca0
[ 5497.405585]  [<ffffffff811c5703>] ? inode_permission+0x13/0x50
[ 5497.405585]  [<ffffffff811c5bde>] ? link_path_walk+0x23e/0x850
[ 5497.405585]  [<ffffffff811c98e9>] path_openat+0xb9/0x670
[ 5497.405585]  [<ffffffff811b366b>] ? poison_obj+0x2b/0x40
[ 5497.405585]  [<ffffffff811ca26e>] do_filp_open+0x3e/0xa0
[ 5497.405585]  [<ffffffff811d7da1>] ? __alloc_fd+0xd1/0x120
[ 5497.405585]  [<ffffffff811b95cc>] do_sys_open+0x13c/0x230
[ 5497.405585]  [<ffffffff810f5bbd>] ? trace_hardirqs_on_caller+0x10d/0x1d0
[ 5497.405585]  [<ffffffff811b96dd>] SyS_open+0x1d/0x20
[ 5497.405585]  [<ffffffff81d9f5a9>] system_call_fastpath+0x16/0x1b


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: kernel BUG in fs/dcache.c running nfs
  2014-09-08 14:45 kernel BUG in fs/dcache.c running nfs Christoph Hellwig
@ 2014-09-09 14:59 ` Jeff Layton
  2014-09-09 15:42   ` Christoph Hellwig
  2014-09-09 16:59   ` Trond Myklebust
  0 siblings, 2 replies; 13+ messages in thread
From: Jeff Layton @ 2014-09-09 14:59 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-nfs

On Mon, 8 Sep 2014 07:45:25 -0700
Christoph Hellwig <hch@infradead.org> wrote:

> I've reproduced a BUG in dcache.c twice now when running xfstests, once
> without the "nfs4: queue free_lock_state job submission to nfsiod" and
> once with that revert in.
> 
> Both runs are using v4.1 against a Linux server with XFS on the same
> host.
> 
> generic/089 199s ...
> [ 5497.402293] ------------[ cut here ]------------
> [ 5497.403150] kernel BUG at ../fs/dcache.c:1620!
> [ 5497.403974] invalid opcode: 0000 [#1] SMP 
> [ 5497.404826] Modules linked in:
> [ 5497.405280] CPU: 1 PID: 14691 Comm: t_mtab Not tainted 3.17.0-rc3+ #264
> [ 5497.405585] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
> [ 5497.405585] task: ffff88007ac801d0 ti: ffff8800670a4000 task.ti: ffff8800670a4000
> [ 5497.405585] RIP: 0010:[<ffffffff811d1345>]  [<ffffffff811d1345>] d_instantiate+0x75/0x80
> [ 5497.405585] RSP: 0018:ffff8800670a7a68  EFLAGS: 00010286
> [ 5497.405585] RAX: 0000000000000001 RBX: ffff880066c399d8 RCX: ffff88007ac80990
> [ 5497.405585] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880066c399d8
> [ 5497.405585] RBP: ffff8800670a7a88 R08: 0000000000000001 R09: 0000000000000000
> [ 5497.405585] R10: ffff880072f45000 R11: 000000000003fdf0 R12: ffff880066c399d8
> [ 5497.405585] R13: ffff88007a684800 R14: ffff88007acbc280 R15: ffff8800670de000
> [ 5497.405585] FS:  00007f6db6aae700(0000) GS:ffff88007fc80000(0000) knlGS:0000000000000000
> [ 5497.405585] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 5497.405585] CR2: 00007f6db5f56800 CR3: 000000007ac9e000 CR4: 00000000000006e0
> [ 5497.405585] Stack:
> [ 5497.405585]  ffff8800670a7a88 ffff880066c399d8 ffff88007a684800 ffff88007a684800
> [ 5497.405585]  ffff8800670a7b68 ffffffff8135d0c1 ffffffff00000004 ffff8800000000d0
> [ 5497.405585]  ffff88007d400180 0000000000000246 ffff8800fffffffe ffff880072f45000
> [ 5497.405585] Call Trace:
> [ 5497.405585]  [<ffffffff8135d0c1>] nfs4_do_open.constprop.84+0x681/0x960
> [ 5497.405585]  [<ffffffff8135d459>] nfs4_atomic_open+0x9/0x20
> [ 5497.405585]  [<ffffffff8136cc3d>] nfs4_file_open+0xcd/0x1b0
> [ 5497.405585]  [<ffffffff81d9ee56>] ? _raw_spin_unlock+0x26/0x30
> [ 5497.405585]  [<ffffffff8176b73d>] ? lockref_get+0x1d/0x30
> [ 5497.405585]  [<ffffffff8136cb70>] ? nfs4_file_fsync+0xb0/0xb0
> [ 5497.405585]  [<ffffffff811b8092>] do_dentry_open.isra.13+0x1f2/0x320
> [ 5497.405585]  [<ffffffff81336f52>] ? nfs_permission+0x62/0x1d0
> [ 5497.405585]  [<ffffffff811b82ad>] finish_open+0x1d/0x30
> [ 5497.405585]  [<ffffffff811c91be>] do_last.isra.63+0x62e/0xca0
> [ 5497.405585]  [<ffffffff811c5703>] ? inode_permission+0x13/0x50
> [ 5497.405585]  [<ffffffff811c5bde>] ? link_path_walk+0x23e/0x850
> [ 5497.405585]  [<ffffffff811c98e9>] path_openat+0xb9/0x670
> [ 5497.405585]  [<ffffffff811b366b>] ? poison_obj+0x2b/0x40
> [ 5497.405585]  [<ffffffff811ca26e>] do_filp_open+0x3e/0xa0
> [ 5497.405585]  [<ffffffff811d7da1>] ? __alloc_fd+0xd1/0x120
> [ 5497.405585]  [<ffffffff811b95cc>] do_sys_open+0x13c/0x230
> [ 5497.405585]  [<ffffffff810f5bbd>] ? trace_hardirqs_on_caller+0x10d/0x1d0
> [ 5497.405585]  [<ffffffff811b96dd>] SyS_open+0x1d/0x20
> [ 5497.405585]  [<ffffffff81d9f5a9>] system_call_fastpath+0x16/0x1b
> 

Odd...

It looks like you hit the BUG() in d_instantiate.

I don't see any calls to d_instantiate in the current nfs_code (aside
from the one in nfs_lookup, and I don't think that's getting called
here). d_instantiate is called by d_add though, and that gets called
from nfs_atomic_open. Is that what happened here?

Maybe you can use gdb to figure out what line of code
"nfs4_do_open.constprop.84+0x681" actually is?

-- 
Jeff Layton <jlayton@primarydata.com>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: kernel BUG in fs/dcache.c running nfs
  2014-09-09 14:59 ` Jeff Layton
@ 2014-09-09 15:42   ` Christoph Hellwig
  2014-09-09 16:12     ` Jeff Layton
  2014-09-09 16:59   ` Trond Myklebust
  1 sibling, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2014-09-09 15:42 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Christoph Hellwig, linux-nfs

On Tue, Sep 09, 2014 at 10:59:18AM -0400, Jeff Layton wrote:
> > [ 5497.405585]  [<ffffffff8135d0c1>] nfs4_do_open.constprop.84+0x681/0x960
> > [ 5497.405585]  [<ffffffff8135d459>] nfs4_atomic_open+0x9/0x20
> > [ 5497.405585]  [<ffffffff8136cc3d>] nfs4_file_open+0xcd/0x1b0
> > [ 5497.405585]  [<ffffffff811b8092>] do_dentry_open.isra.13+0x1f2/0x320
> > [ 5497.405585]  [<ffffffff811b82ad>] finish_open+0x1d/0x30
> > [ 5497.405585]  [<ffffffff811c98e9>] path_openat+0xb9/0x670
> > [ 5497.405585]  [<ffffffff811ca26e>] do_filp_open+0x3e/0xa0
> > [ 5497.405585]  [<ffffffff811b95cc>] do_sys_open+0x13c/0x230
> > [ 5497.405585]  [<ffffffff811b96dd>] SyS_open+0x1d/0x20
> > [ 5497.405585]  [<ffffffff81d9f5a9>] system_call_fastpath+0x16/0x1b
> > 
> 
> Odd...
> 
> It looks like you hit the BUG() in d_instantiate.
> 
> I don't see any calls to d_instantiate in the current nfs_code (aside
> from the one in nfs_lookup, and I don't think that's getting called
> here). d_instantiate is called by d_add though, and that gets called
> from nfs_atomic_open. Is that what happened here?
> 
> Maybe you can use gdb to figure out what line of code
> "nfs4_do_open.constprop.84+0x681" actually is?

My gdb can't cope with these constprop expressions unfortunately.

But when you remove the questionable symbols as I've done above it's
pretty clear that this must be the

nfs4_atomic_open
 -> nfs4_do_open
   -> _nfs4_do_open
     -> _nfs4_open_and_get_state
       -> d_add
         -> d_instantiate

call chain. There is heavy inlining going on inside nfs4file.c, and
d_add now is a simple inline around d_instantiate and d_rehash.
> 
> -- 
> Jeff Layton <jlayton@primarydata.com>
---end quoted text---

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: kernel BUG in fs/dcache.c running nfs
  2014-09-09 15:42   ` Christoph Hellwig
@ 2014-09-09 16:12     ` Jeff Layton
  2014-09-09 16:15       ` Jeff Layton
  0 siblings, 1 reply; 13+ messages in thread
From: Jeff Layton @ 2014-09-09 16:12 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Jeff Layton, linux-nfs

On Tue, 9 Sep 2014 08:42:11 -0700
Christoph Hellwig <hch@infradead.org> wrote:

> On Tue, Sep 09, 2014 at 10:59:18AM -0400, Jeff Layton wrote:
> > > [ 5497.405585]  [<ffffffff8135d0c1>] nfs4_do_open.constprop.84+0x681/0x960
> > > [ 5497.405585]  [<ffffffff8135d459>] nfs4_atomic_open+0x9/0x20
> > > [ 5497.405585]  [<ffffffff8136cc3d>] nfs4_file_open+0xcd/0x1b0
> > > [ 5497.405585]  [<ffffffff811b8092>] do_dentry_open.isra.13+0x1f2/0x320
> > > [ 5497.405585]  [<ffffffff811b82ad>] finish_open+0x1d/0x30
> > > [ 5497.405585]  [<ffffffff811c98e9>] path_openat+0xb9/0x670
> > > [ 5497.405585]  [<ffffffff811ca26e>] do_filp_open+0x3e/0xa0
> > > [ 5497.405585]  [<ffffffff811b95cc>] do_sys_open+0x13c/0x230
> > > [ 5497.405585]  [<ffffffff811b96dd>] SyS_open+0x1d/0x20
> > > [ 5497.405585]  [<ffffffff81d9f5a9>] system_call_fastpath+0x16/0x1b
> > > 
> > 
> > Odd...
> > 
> > It looks like you hit the BUG() in d_instantiate.
> > 
> > I don't see any calls to d_instantiate in the current nfs_code (aside
> > from the one in nfs_lookup, and I don't think that's getting called
> > here). d_instantiate is called by d_add though, and that gets called
> > from nfs_atomic_open. Is that what happened here?
> > 
> > Maybe you can use gdb to figure out what line of code
> > "nfs4_do_open.constprop.84+0x681" actually is?
> 
> My gdb can't cope with these constprop expressions unfortunately.
> 
> But when you remove the questionable symbols as I've done above it's
> pretty clear that this must be the
> 
> nfs4_atomic_open
>  -> nfs4_do_open
>    -> _nfs4_do_open
>      -> _nfs4_open_and_get_state
>        -> d_add
>          -> d_instantiate
> 
> call chain. There is heavy inlining going on inside nfs4file.c, and
> d_add now is a simple inline around d_instantiate and d_rehash.

Ok. So I'm guessing that means that the current scheme of doing a
d_drop/d_add is not valid. d_drop doesn't remove the dentry from the
i_alias list.

It looks like the first call there should just be doing a d_delete
instead, since it's trying to turn the thing into a negative dentry.
-- 
Jeff Layton <jlayton@primarydata.com>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: kernel BUG in fs/dcache.c running nfs
  2014-09-09 16:12     ` Jeff Layton
@ 2014-09-09 16:15       ` Jeff Layton
  2014-09-09 17:50         ` Jeff Layton
  0 siblings, 1 reply; 13+ messages in thread
From: Jeff Layton @ 2014-09-09 16:15 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Jeff Layton, linux-nfs, neilb

On Tue, 9 Sep 2014 12:12:44 -0400
Jeff Layton <jlayton@primarydata.com> wrote:

> On Tue, 9 Sep 2014 08:42:11 -0700
> Christoph Hellwig <hch@infradead.org> wrote:
> 
> > On Tue, Sep 09, 2014 at 10:59:18AM -0400, Jeff Layton wrote:
> > > > [ 5497.405585]  [<ffffffff8135d0c1>] nfs4_do_open.constprop.84+0x681/0x960
> > > > [ 5497.405585]  [<ffffffff8135d459>] nfs4_atomic_open+0x9/0x20
> > > > [ 5497.405585]  [<ffffffff8136cc3d>] nfs4_file_open+0xcd/0x1b0
> > > > [ 5497.405585]  [<ffffffff811b8092>] do_dentry_open.isra.13+0x1f2/0x320
> > > > [ 5497.405585]  [<ffffffff811b82ad>] finish_open+0x1d/0x30
> > > > [ 5497.405585]  [<ffffffff811c98e9>] path_openat+0xb9/0x670
> > > > [ 5497.405585]  [<ffffffff811ca26e>] do_filp_open+0x3e/0xa0
> > > > [ 5497.405585]  [<ffffffff811b95cc>] do_sys_open+0x13c/0x230
> > > > [ 5497.405585]  [<ffffffff811b96dd>] SyS_open+0x1d/0x20
> > > > [ 5497.405585]  [<ffffffff81d9f5a9>] system_call_fastpath+0x16/0x1b
> > > > 
> > > 
> > > Odd...
> > > 
> > > It looks like you hit the BUG() in d_instantiate.
> > > 
> > > I don't see any calls to d_instantiate in the current nfs_code (aside
> > > from the one in nfs_lookup, and I don't think that's getting called
> > > here). d_instantiate is called by d_add though, and that gets called
> > > from nfs_atomic_open. Is that what happened here?
> > > 
> > > Maybe you can use gdb to figure out what line of code
> > > "nfs4_do_open.constprop.84+0x681" actually is?
> > 
> > My gdb can't cope with these constprop expressions unfortunately.
> > 
> > But when you remove the questionable symbols as I've done above it's
> > pretty clear that this must be the
> > 
> > nfs4_atomic_open
> >  -> nfs4_do_open
> >    -> _nfs4_do_open
> >      -> _nfs4_open_and_get_state
> >        -> d_add
> >          -> d_instantiate
> > 
> > call chain. There is heavy inlining going on inside nfs4file.c, and
> > d_add now is a simple inline around d_instantiate and d_rehash.
> 
> Ok. So I'm guessing that means that the current scheme of doing a
> d_drop/d_add is not valid. d_drop doesn't remove the dentry from the
> i_alias list.
> 
> It looks like the first call there should just be doing a d_delete
> instead, since it's trying to turn the thing into a negative dentry.

(cc'ing Neil B.)

...and I'd hazard a guess that 4fa2c54b5198d might be the culprit. You
might want to try backing that out and see if it's still reproducible.

Neil, any thoughts?

-- 
Jeff Layton <jlayton@primarydata.com>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: kernel BUG in fs/dcache.c running nfs
  2014-09-09 14:59 ` Jeff Layton
  2014-09-09 15:42   ` Christoph Hellwig
@ 2014-09-09 16:59   ` Trond Myklebust
  2014-09-09 17:28     ` Jeff Layton
  1 sibling, 1 reply; 13+ messages in thread
From: Trond Myklebust @ 2014-09-09 16:59 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Christoph Hellwig, Linux NFS Mailing List

On Tue, Sep 9, 2014 at 7:59 AM, Jeff Layton <jeff.layton@primarydata.com> wrote:
> On Mon, 8 Sep 2014 07:45:25 -0700
> Christoph Hellwig <hch@infradead.org> wrote:
>
>> I've reproduced a BUG in dcache.c twice now when running xfstests, once
>> without the "nfs4: queue free_lock_state job submission to nfsiod" and
>> once with that revert in.
>>
>> Both runs are using v4.1 against a Linux server with XFS on the same
>> host.
>>
>> generic/089 199s ...
>> [ 5497.402293] ------------[ cut here ]------------
>> [ 5497.403150] kernel BUG at ../fs/dcache.c:1620!
>> [ 5497.403974] invalid opcode: 0000 [#1] SMP
>> [ 5497.404826] Modules linked in:
>> [ 5497.405280] CPU: 1 PID: 14691 Comm: t_mtab Not tainted 3.17.0-rc3+ #264
>> [ 5497.405585] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
>> [ 5497.405585] task: ffff88007ac801d0 ti: ffff8800670a4000 task.ti: ffff8800670a4000
>> [ 5497.405585] RIP: 0010:[<ffffffff811d1345>]  [<ffffffff811d1345>] d_instantiate+0x75/0x80
>> [ 5497.405585] RSP: 0018:ffff8800670a7a68  EFLAGS: 00010286
>> [ 5497.405585] RAX: 0000000000000001 RBX: ffff880066c399d8 RCX: ffff88007ac80990
>> [ 5497.405585] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880066c399d8
>> [ 5497.405585] RBP: ffff8800670a7a88 R08: 0000000000000001 R09: 0000000000000000
>> [ 5497.405585] R10: ffff880072f45000 R11: 000000000003fdf0 R12: ffff880066c399d8
>> [ 5497.405585] R13: ffff88007a684800 R14: ffff88007acbc280 R15: ffff8800670de000
>> [ 5497.405585] FS:  00007f6db6aae700(0000) GS:ffff88007fc80000(0000) knlGS:0000000000000000
>> [ 5497.405585] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>> [ 5497.405585] CR2: 00007f6db5f56800 CR3: 000000007ac9e000 CR4: 00000000000006e0
>> [ 5497.405585] Stack:
>> [ 5497.405585]  ffff8800670a7a88 ffff880066c399d8 ffff88007a684800 ffff88007a684800
>> [ 5497.405585]  ffff8800670a7b68 ffffffff8135d0c1 ffffffff00000004 ffff8800000000d0
>> [ 5497.405585]  ffff88007d400180 0000000000000246 ffff8800fffffffe ffff880072f45000
>> [ 5497.405585] Call Trace:
>> [ 5497.405585]  [<ffffffff8135d0c1>] nfs4_do_open.constprop.84+0x681/0x960
>> [ 5497.405585]  [<ffffffff8135d459>] nfs4_atomic_open+0x9/0x20
>> [ 5497.405585]  [<ffffffff8136cc3d>] nfs4_file_open+0xcd/0x1b0
>> [ 5497.405585]  [<ffffffff81d9ee56>] ? _raw_spin_unlock+0x26/0x30
>> [ 5497.405585]  [<ffffffff8176b73d>] ? lockref_get+0x1d/0x30
>> [ 5497.405585]  [<ffffffff8136cb70>] ? nfs4_file_fsync+0xb0/0xb0
>> [ 5497.405585]  [<ffffffff811b8092>] do_dentry_open.isra.13+0x1f2/0x320
>> [ 5497.405585]  [<ffffffff81336f52>] ? nfs_permission+0x62/0x1d0
>> [ 5497.405585]  [<ffffffff811b82ad>] finish_open+0x1d/0x30
>> [ 5497.405585]  [<ffffffff811c91be>] do_last.isra.63+0x62e/0xca0
>> [ 5497.405585]  [<ffffffff811c5703>] ? inode_permission+0x13/0x50
>> [ 5497.405585]  [<ffffffff811c5bde>] ? link_path_walk+0x23e/0x850
>> [ 5497.405585]  [<ffffffff811c98e9>] path_openat+0xb9/0x670
>> [ 5497.405585]  [<ffffffff811b366b>] ? poison_obj+0x2b/0x40
>> [ 5497.405585]  [<ffffffff811ca26e>] do_filp_open+0x3e/0xa0
>> [ 5497.405585]  [<ffffffff811d7da1>] ? __alloc_fd+0xd1/0x120
>> [ 5497.405585]  [<ffffffff811b95cc>] do_sys_open+0x13c/0x230
>> [ 5497.405585]  [<ffffffff810f5bbd>] ? trace_hardirqs_on_caller+0x10d/0x1d0
>> [ 5497.405585]  [<ffffffff811b96dd>] SyS_open+0x1d/0x20
>> [ 5497.405585]  [<ffffffff81d9f5a9>] system_call_fastpath+0x16/0x1b
>>
>
> Odd...
>
> It looks like you hit the BUG() in d_instantiate.
>
> I don't see any calls to d_instantiate in the current nfs_code (aside
> from the one in nfs_lookup, and I don't think that's getting called
> here). d_instantiate is called by d_add though, and that gets called
> from nfs_atomic_open. Is that what happened here?
>
> Maybe you can use gdb to figure out what line of code
> "nfs4_do_open.constprop.84+0x681" actually is?
>

Hi Jeff,

I took a look at this last night. I suspect that the problem is with
error handling.

So the problem is that in the current code, we can end up attaching
the dentry to the dcache using d_add() in _nfs4_open_and_get_state. I
suspect that we're hitting an error _after_ that call, that is causing
the callers of _nfs4_do_open() to loop and retry the open (and thus
call d_add a second time).
I haven't had time to investigate where this might be happening, but
feel free to look...

Cheers
  Trond

-- 
Trond Myklebust

Linux NFS client maintainer, PrimaryData

trond.myklebust@primarydata.com

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: kernel BUG in fs/dcache.c running nfs
  2014-09-09 16:59   ` Trond Myklebust
@ 2014-09-09 17:28     ` Jeff Layton
  0 siblings, 0 replies; 13+ messages in thread
From: Jeff Layton @ 2014-09-09 17:28 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Jeff Layton, Christoph Hellwig, Linux NFS Mailing List, neilb

On Tue, 9 Sep 2014 09:59:33 -0700
Trond Myklebust <trond.myklebust@primarydata.com> wrote:

> On Tue, Sep 9, 2014 at 7:59 AM, Jeff Layton <jeff.layton@primarydata.com> wrote:
> > On Mon, 8 Sep 2014 07:45:25 -0700
> > Christoph Hellwig <hch@infradead.org> wrote:
> >
> >> I've reproduced a BUG in dcache.c twice now when running xfstests, once
> >> without the "nfs4: queue free_lock_state job submission to nfsiod" and
> >> once with that revert in.
> >>
> >> Both runs are using v4.1 against a Linux server with XFS on the same
> >> host.
> >>
> >> generic/089 199s ...
> >> [ 5497.402293] ------------[ cut here ]------------
> >> [ 5497.403150] kernel BUG at ../fs/dcache.c:1620!
> >> [ 5497.403974] invalid opcode: 0000 [#1] SMP
> >> [ 5497.404826] Modules linked in:
> >> [ 5497.405280] CPU: 1 PID: 14691 Comm: t_mtab Not tainted 3.17.0-rc3+ #264
> >> [ 5497.405585] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
> >> [ 5497.405585] task: ffff88007ac801d0 ti: ffff8800670a4000 task.ti: ffff8800670a4000
> >> [ 5497.405585] RIP: 0010:[<ffffffff811d1345>]  [<ffffffff811d1345>] d_instantiate+0x75/0x80
> >> [ 5497.405585] RSP: 0018:ffff8800670a7a68  EFLAGS: 00010286
> >> [ 5497.405585] RAX: 0000000000000001 RBX: ffff880066c399d8 RCX: ffff88007ac80990
> >> [ 5497.405585] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880066c399d8
> >> [ 5497.405585] RBP: ffff8800670a7a88 R08: 0000000000000001 R09: 0000000000000000
> >> [ 5497.405585] R10: ffff880072f45000 R11: 000000000003fdf0 R12: ffff880066c399d8
> >> [ 5497.405585] R13: ffff88007a684800 R14: ffff88007acbc280 R15: ffff8800670de000
> >> [ 5497.405585] FS:  00007f6db6aae700(0000) GS:ffff88007fc80000(0000) knlGS:0000000000000000
> >> [ 5497.405585] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> >> [ 5497.405585] CR2: 00007f6db5f56800 CR3: 000000007ac9e000 CR4: 00000000000006e0
> >> [ 5497.405585] Stack:
> >> [ 5497.405585]  ffff8800670a7a88 ffff880066c399d8 ffff88007a684800 ffff88007a684800
> >> [ 5497.405585]  ffff8800670a7b68 ffffffff8135d0c1 ffffffff00000004 ffff8800000000d0
> >> [ 5497.405585]  ffff88007d400180 0000000000000246 ffff8800fffffffe ffff880072f45000
> >> [ 5497.405585] Call Trace:
> >> [ 5497.405585]  [<ffffffff8135d0c1>] nfs4_do_open.constprop.84+0x681/0x960
> >> [ 5497.405585]  [<ffffffff8135d459>] nfs4_atomic_open+0x9/0x20
> >> [ 5497.405585]  [<ffffffff8136cc3d>] nfs4_file_open+0xcd/0x1b0
> >> [ 5497.405585]  [<ffffffff81d9ee56>] ? _raw_spin_unlock+0x26/0x30
> >> [ 5497.405585]  [<ffffffff8176b73d>] ? lockref_get+0x1d/0x30
> >> [ 5497.405585]  [<ffffffff8136cb70>] ? nfs4_file_fsync+0xb0/0xb0
> >> [ 5497.405585]  [<ffffffff811b8092>] do_dentry_open.isra.13+0x1f2/0x320
> >> [ 5497.405585]  [<ffffffff81336f52>] ? nfs_permission+0x62/0x1d0
> >> [ 5497.405585]  [<ffffffff811b82ad>] finish_open+0x1d/0x30
> >> [ 5497.405585]  [<ffffffff811c91be>] do_last.isra.63+0x62e/0xca0
> >> [ 5497.405585]  [<ffffffff811c5703>] ? inode_permission+0x13/0x50
> >> [ 5497.405585]  [<ffffffff811c5bde>] ? link_path_walk+0x23e/0x850
> >> [ 5497.405585]  [<ffffffff811c98e9>] path_openat+0xb9/0x670
> >> [ 5497.405585]  [<ffffffff811b366b>] ? poison_obj+0x2b/0x40
> >> [ 5497.405585]  [<ffffffff811ca26e>] do_filp_open+0x3e/0xa0
> >> [ 5497.405585]  [<ffffffff811d7da1>] ? __alloc_fd+0xd1/0x120
> >> [ 5497.405585]  [<ffffffff811b95cc>] do_sys_open+0x13c/0x230
> >> [ 5497.405585]  [<ffffffff810f5bbd>] ? trace_hardirqs_on_caller+0x10d/0x1d0
> >> [ 5497.405585]  [<ffffffff811b96dd>] SyS_open+0x1d/0x20
> >> [ 5497.405585]  [<ffffffff81d9f5a9>] system_call_fastpath+0x16/0x1b
> >>
> >
> > Odd...
> >
> > It looks like you hit the BUG() in d_instantiate.
> >
> > I don't see any calls to d_instantiate in the current nfs_code (aside
> > from the one in nfs_lookup, and I don't think that's getting called
> > here). d_instantiate is called by d_add though, and that gets called
> > from nfs_atomic_open. Is that what happened here?
> >
> > Maybe you can use gdb to figure out what line of code
> > "nfs4_do_open.constprop.84+0x681" actually is?
> >
> 
> Hi Jeff,
> 
> I took a look at this last night. I suspect that the problem is with
> error handling.
> 
> So the problem is that in the current code, we can end up attaching
> the dentry to the dcache using d_add() in _nfs4_open_and_get_state. I
> suspect that we're hitting an error _after_ that call, that is causing
> the callers of _nfs4_do_open() to loop and retry the open (and thus
> call d_add a second time).
> I haven't had time to investigate where this might be happening, but
> feel free to look...
> 
> Cheers
>   Trond
> 

(readding Neil to cc list)

Ahh ok, makes sense. nfs_atomic_open does this early on:

        BUG_ON(dentry->d_inode);

...so dentry should have been negative when we called into here
originally. That implies that something has already turned it
into a positive dentry here before the d_drop/d_add got called.

Still, turning the d_drop/d_add into a d_delete seems like it might be
the right thing to do there instead. The code is trying to turn this
into a negative dentry and that's what d_delete is supposed to be for.

-- 
Jeff Layton <jlayton@primarydata.com>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: kernel BUG in fs/dcache.c running nfs
  2014-09-09 16:15       ` Jeff Layton
@ 2014-09-09 17:50         ` Jeff Layton
  2014-09-10  3:57           ` NeilBrown
  0 siblings, 1 reply; 13+ messages in thread
From: Jeff Layton @ 2014-09-09 17:50 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Christoph Hellwig, linux-nfs, neilb

On Tue, 9 Sep 2014 12:15:46 -0400
Jeff Layton <jeff.layton@primarydata.com> wrote:

> On Tue, 9 Sep 2014 12:12:44 -0400
> Jeff Layton <jlayton@primarydata.com> wrote:
> 
> > On Tue, 9 Sep 2014 08:42:11 -0700
> > Christoph Hellwig <hch@infradead.org> wrote:
> > 
> > > On Tue, Sep 09, 2014 at 10:59:18AM -0400, Jeff Layton wrote:
> > > > > [ 5497.405585]  [<ffffffff8135d0c1>] nfs4_do_open.constprop.84+0x681/0x960
> > > > > [ 5497.405585]  [<ffffffff8135d459>] nfs4_atomic_open+0x9/0x20
> > > > > [ 5497.405585]  [<ffffffff8136cc3d>] nfs4_file_open+0xcd/0x1b0
> > > > > [ 5497.405585]  [<ffffffff811b8092>] do_dentry_open.isra.13+0x1f2/0x320
> > > > > [ 5497.405585]  [<ffffffff811b82ad>] finish_open+0x1d/0x30
> > > > > [ 5497.405585]  [<ffffffff811c98e9>] path_openat+0xb9/0x670
> > > > > [ 5497.405585]  [<ffffffff811ca26e>] do_filp_open+0x3e/0xa0
> > > > > [ 5497.405585]  [<ffffffff811b95cc>] do_sys_open+0x13c/0x230
> > > > > [ 5497.405585]  [<ffffffff811b96dd>] SyS_open+0x1d/0x20
> > > > > [ 5497.405585]  [<ffffffff81d9f5a9>] system_call_fastpath+0x16/0x1b
> > > > > 
> > > > 
> > > > Odd...
> > > > 
> > > > It looks like you hit the BUG() in d_instantiate.
> > > > 
> > > > I don't see any calls to d_instantiate in the current nfs_code (aside
> > > > from the one in nfs_lookup, and I don't think that's getting called
> > > > here). d_instantiate is called by d_add though, and that gets called
> > > > from nfs_atomic_open. Is that what happened here?
> > > > 
> > > > Maybe you can use gdb to figure out what line of code
> > > > "nfs4_do_open.constprop.84+0x681" actually is?
> > > 
> > > My gdb can't cope with these constprop expressions unfortunately.
> > > 
> > > But when you remove the questionable symbols as I've done above it's
> > > pretty clear that this must be the
> > > 
> > > nfs4_atomic_open
> > >  -> nfs4_do_open
> > >    -> _nfs4_do_open
> > >      -> _nfs4_open_and_get_state
> > >        -> d_add
> > >          -> d_instantiate
> > > 
> > > call chain. There is heavy inlining going on inside nfs4file.c, and
> > > d_add now is a simple inline around d_instantiate and d_rehash.
> > 
> > Ok. So I'm guessing that means that the current scheme of doing a
> > d_drop/d_add is not valid. d_drop doesn't remove the dentry from the
> > i_alias list.
> > 
> > It looks like the first call there should just be doing a d_delete
> > instead, since it's trying to turn the thing into a negative dentry.
> 
> (cc'ing Neil B.)
> 
> ...and I'd hazard a guess that 4fa2c54b5198d might be the culprit. You
> might want to try backing that out and see if it's still reproducible.
> 
> Neil, any thoughts?
> 

In fact, maybe this patch would make sense?

------------------[snip]-------------------

[PATCH] nfs: d_drop/d_add of positive dentry may be harmful

Christoph reported the following oops, when running xfstests:

generic/089 199s ...
[ 5497.402293] ------------[ cut here ]------------
[ 5497.403150] kernel BUG at ../fs/dcache.c:1620!
[ 5497.403974] invalid opcode: 0000 [#1] SMP
[ 5497.404826] Modules linked in:
[ 5497.405280] CPU: 1 PID: 14691 Comm: t_mtab Not tainted 3.17.0-rc3+ #264
[ 5497.405585] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[ 5497.405585] task: ffff88007ac801d0 ti: ffff8800670a4000 task.ti: ffff8800670a4000
[ 5497.405585] RIP: 0010:[<ffffffff811d1345>]  [<ffffffff811d1345>] d_instantiate+0x75/0x80
[ 5497.405585] RSP: 0018:ffff8800670a7a68  EFLAGS: 00010286
[ 5497.405585] RAX: 0000000000000001 RBX: ffff880066c399d8 RCX: ffff88007ac80990
[ 5497.405585] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880066c399d8
[ 5497.405585] RBP: ffff8800670a7a88 R08: 0000000000000001 R09: 0000000000000000
[ 5497.405585] R10: ffff880072f45000 R11: 000000000003fdf0 R12: ffff880066c399d8
[ 5497.405585] R13: ffff88007a684800 R14: ffff88007acbc280 R15: ffff8800670de000
[ 5497.405585] FS:  00007f6db6aae700(0000) GS:ffff88007fc80000(0000) knlGS:0000000000000000
[ 5497.405585] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 5497.405585] CR2: 00007f6db5f56800 CR3: 000000007ac9e000 CR4: 00000000000006e0
[ 5497.405585] Stack:
[ 5497.405585]  ffff8800670a7a88 ffff880066c399d8 ffff88007a684800 ffff88007a684800
[ 5497.405585]  ffff8800670a7b68 ffffffff8135d0c1 ffffffff00000004 ffff8800000000d0
[ 5497.405585]  ffff88007d400180 0000000000000246 ffff8800fffffffe ffff880072f45000
[ 5497.405585] Call Trace:
[ 5497.405585]  [<ffffffff8135d0c1>] nfs4_do_open.constprop.84+0x681/0x960
[ 5497.405585]  [<ffffffff8135d459>] nfs4_atomic_open+0x9/0x20
[ 5497.405585]  [<ffffffff8136cc3d>] nfs4_file_open+0xcd/0x1b0
[ 5497.405585]  [<ffffffff81d9ee56>] ? _raw_spin_unlock+0x26/0x30
[ 5497.405585]  [<ffffffff8176b73d>] ? lockref_get+0x1d/0x30
[ 5497.405585]  [<ffffffff8136cb70>] ? nfs4_file_fsync+0xb0/0xb0
[ 5497.405585]  [<ffffffff811b8092>] do_dentry_open.isra.13+0x1f2/0x320
[ 5497.405585]  [<ffffffff81336f52>] ? nfs_permission+0x62/0x1d0
[ 5497.405585]  [<ffffffff811b82ad>] finish_open+0x1d/0x30
[ 5497.405585]  [<ffffffff811c91be>] do_last.isra.63+0x62e/0xca0
[ 5497.405585]  [<ffffffff811c5703>] ? inode_permission+0x13/0x50
[ 5497.405585]  [<ffffffff811c5bde>] ? link_path_walk+0x23e/0x850
[ 5497.405585]  [<ffffffff811c98e9>] path_openat+0xb9/0x670
[ 5497.405585]  [<ffffffff811b366b>] ? poison_obj+0x2b/0x40
[ 5497.405585]  [<ffffffff811ca26e>] do_filp_open+0x3e/0xa0
[ 5497.405585]  [<ffffffff811d7da1>] ? __alloc_fd+0xd1/0x120
[ 5497.405585]  [<ffffffff811b95cc>] do_sys_open+0x13c/0x230
[ 5497.405585]  [<ffffffff810f5bbd>] ? trace_hardirqs_on_caller+0x10d/0x1d0
[ 5497.405585]  [<ffffffff811b96dd>] SyS_open+0x1d/0x20
[ 5497.405585]  [<ffffffff81d9f5a9>] system_call_fastpath+0x16/0x1b

The BUG() is due to the fact that the d_alias hlist is not empty when
we called into d_instantiate. This is likely due to a situation where
we did a successful open and instantiated the dentry and then later
failed and ended up retrying. At that point, we try the open again and
get back -ENOENT, and try to d_drop/d_add it.

The problem is that d_drop'ing a positive dentry is not sufficient to
"clear" it for adding it back into the cache. That just makes it unfindable
in the hash tables, but doesn't unhitch it from the inode.

Switch to using the helper we already have for turning positive dentries
into negative ones.

Cc: Neil Brown <neilb@suse.de>
Reported-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
---
 fs/nfs/dir.c      | 3 ++-
 fs/nfs/internal.h | 1 +
 fs/nfs/nfs4proc.c | 3 +--
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 36d921f0c602..3938dba859c5 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1754,11 +1754,12 @@ out_err:
 }
 EXPORT_SYMBOL_GPL(nfs_mkdir);
 
-static void nfs_dentry_handle_enoent(struct dentry *dentry)
+void nfs_dentry_handle_enoent(struct dentry *dentry)
 {
 	if (dentry->d_inode != NULL && !d_unhashed(dentry))
 		d_delete(dentry);
 }
+EXPORT_SYMBOL_GPL(nfs_dentry_handle_enoent);
 
 int nfs_rmdir(struct inode *dir, struct dentry *dentry)
 {
diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
index 9056622d2230..8d85a57ae499 100644
--- a/fs/nfs/internal.h
+++ b/fs/nfs/internal.h
@@ -326,6 +326,7 @@ extern unsigned long nfs_access_cache_scan(struct shrinker *shrink,
 struct dentry *nfs_lookup(struct inode *, struct dentry *, unsigned int);
 int nfs_create(struct inode *, struct dentry *, umode_t, bool);
 int nfs_mkdir(struct inode *, struct dentry *, umode_t);
+void nfs_dentry_handle_enoent(struct dentry *);
 int nfs_rmdir(struct inode *, struct dentry *);
 int nfs_unlink(struct inode *, struct dentry *);
 int nfs_symlink(struct inode *, struct dentry *, const char *);
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 7dd8aca31c29..84ee3fb9e410 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2226,8 +2226,7 @@ static int _nfs4_open_and_get_state(struct nfs4_opendata *opendata,
 	ret = _nfs4_proc_open(opendata);
 	if (ret != 0) {
 		if (ret == -ENOENT) {
-			d_drop(opendata->dentry);
-			d_add(opendata->dentry, NULL);
+			nfs_dentry_handle_enoent(opendata->dentry);
 			nfs_set_verifier(opendata->dentry,
 					 nfs_save_change_attribute(opendata->dir->d_inode));
 		}
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: kernel BUG in fs/dcache.c running nfs
  2014-09-09 17:50         ` Jeff Layton
@ 2014-09-10  3:57           ` NeilBrown
  2014-09-10 11:59             ` Jeff Layton
  2014-09-26 23:08             ` Al Viro
  0 siblings, 2 replies; 13+ messages in thread
From: NeilBrown @ 2014-09-10  3:57 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Christoph Hellwig, linux-nfs

[-- Attachment #1: Type: text/plain, Size: 9643 bytes --]

On Tue, 9 Sep 2014 13:50:06 -0400 Jeff Layton <jeff.layton@primarydata.com>
wrote:

> On Tue, 9 Sep 2014 12:15:46 -0400
> Jeff Layton <jeff.layton@primarydata.com> wrote:
> 
> > On Tue, 9 Sep 2014 12:12:44 -0400
> > Jeff Layton <jlayton@primarydata.com> wrote:
> > 
> > > On Tue, 9 Sep 2014 08:42:11 -0700
> > > Christoph Hellwig <hch@infradead.org> wrote:
> > > 
> > > > On Tue, Sep 09, 2014 at 10:59:18AM -0400, Jeff Layton wrote:
> > > > > > [ 5497.405585]  [<ffffffff8135d0c1>] nfs4_do_open.constprop.84+0x681/0x960
> > > > > > [ 5497.405585]  [<ffffffff8135d459>] nfs4_atomic_open+0x9/0x20
> > > > > > [ 5497.405585]  [<ffffffff8136cc3d>] nfs4_file_open+0xcd/0x1b0
> > > > > > [ 5497.405585]  [<ffffffff811b8092>] do_dentry_open.isra.13+0x1f2/0x320
> > > > > > [ 5497.405585]  [<ffffffff811b82ad>] finish_open+0x1d/0x30
> > > > > > [ 5497.405585]  [<ffffffff811c98e9>] path_openat+0xb9/0x670
> > > > > > [ 5497.405585]  [<ffffffff811ca26e>] do_filp_open+0x3e/0xa0
> > > > > > [ 5497.405585]  [<ffffffff811b95cc>] do_sys_open+0x13c/0x230
> > > > > > [ 5497.405585]  [<ffffffff811b96dd>] SyS_open+0x1d/0x20
> > > > > > [ 5497.405585]  [<ffffffff81d9f5a9>] system_call_fastpath+0x16/0x1b
> > > > > > 
> > > > > 
> > > > > Odd...
> > > > > 
> > > > > It looks like you hit the BUG() in d_instantiate.
> > > > > 
> > > > > I don't see any calls to d_instantiate in the current nfs_code (aside
> > > > > from the one in nfs_lookup, and I don't think that's getting called
> > > > > here). d_instantiate is called by d_add though, and that gets called
> > > > > from nfs_atomic_open. Is that what happened here?
> > > > > 
> > > > > Maybe you can use gdb to figure out what line of code
> > > > > "nfs4_do_open.constprop.84+0x681" actually is?
> > > > 
> > > > My gdb can't cope with these constprop expressions unfortunately.
> > > > 
> > > > But when you remove the questionable symbols as I've done above it's
> > > > pretty clear that this must be the
> > > > 
> > > > nfs4_atomic_open
> > > >  -> nfs4_do_open
> > > >    -> _nfs4_do_open
> > > >      -> _nfs4_open_and_get_state
> > > >        -> d_add
> > > >          -> d_instantiate
> > > > 
> > > > call chain. There is heavy inlining going on inside nfs4file.c, and
> > > > d_add now is a simple inline around d_instantiate and d_rehash.
> > > 
> > > Ok. So I'm guessing that means that the current scheme of doing a
> > > d_drop/d_add is not valid. d_drop doesn't remove the dentry from the
> > > i_alias list.
> > > 
> > > It looks like the first call there should just be doing a d_delete
> > > instead, since it's trying to turn the thing into a negative dentry.
> > 
> > (cc'ing Neil B.)
> > 
> > ...and I'd hazard a guess that 4fa2c54b5198d might be the culprit. You
> > might want to try backing that out and see if it's still reproducible.
> > 
> > Neil, any thoughts?
> > 
> 
> In fact, maybe this patch would make sense?
> 
> ------------------[snip]-------------------
> 
> [PATCH] nfs: d_drop/d_add of positive dentry may be harmful
> 
> Christoph reported the following oops, when running xfstests:
> 
> generic/089 199s ...
> [ 5497.402293] ------------[ cut here ]------------
> [ 5497.403150] kernel BUG at ../fs/dcache.c:1620!
> [ 5497.403974] invalid opcode: 0000 [#1] SMP
> [ 5497.404826] Modules linked in:
> [ 5497.405280] CPU: 1 PID: 14691 Comm: t_mtab Not tainted 3.17.0-rc3+ #264
> [ 5497.405585] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
> [ 5497.405585] task: ffff88007ac801d0 ti: ffff8800670a4000 task.ti: ffff8800670a4000
> [ 5497.405585] RIP: 0010:[<ffffffff811d1345>]  [<ffffffff811d1345>] d_instantiate+0x75/0x80
> [ 5497.405585] RSP: 0018:ffff8800670a7a68  EFLAGS: 00010286
> [ 5497.405585] RAX: 0000000000000001 RBX: ffff880066c399d8 RCX: ffff88007ac80990
> [ 5497.405585] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880066c399d8
> [ 5497.405585] RBP: ffff8800670a7a88 R08: 0000000000000001 R09: 0000000000000000
> [ 5497.405585] R10: ffff880072f45000 R11: 000000000003fdf0 R12: ffff880066c399d8
> [ 5497.405585] R13: ffff88007a684800 R14: ffff88007acbc280 R15: ffff8800670de000
> [ 5497.405585] FS:  00007f6db6aae700(0000) GS:ffff88007fc80000(0000) knlGS:0000000000000000
> [ 5497.405585] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 5497.405585] CR2: 00007f6db5f56800 CR3: 000000007ac9e000 CR4: 00000000000006e0
> [ 5497.405585] Stack:
> [ 5497.405585]  ffff8800670a7a88 ffff880066c399d8 ffff88007a684800 ffff88007a684800
> [ 5497.405585]  ffff8800670a7b68 ffffffff8135d0c1 ffffffff00000004 ffff8800000000d0
> [ 5497.405585]  ffff88007d400180 0000000000000246 ffff8800fffffffe ffff880072f45000
> [ 5497.405585] Call Trace:
> [ 5497.405585]  [<ffffffff8135d0c1>] nfs4_do_open.constprop.84+0x681/0x960
> [ 5497.405585]  [<ffffffff8135d459>] nfs4_atomic_open+0x9/0x20
> [ 5497.405585]  [<ffffffff8136cc3d>] nfs4_file_open+0xcd/0x1b0
> [ 5497.405585]  [<ffffffff81d9ee56>] ? _raw_spin_unlock+0x26/0x30
> [ 5497.405585]  [<ffffffff8176b73d>] ? lockref_get+0x1d/0x30
> [ 5497.405585]  [<ffffffff8136cb70>] ? nfs4_file_fsync+0xb0/0xb0
> [ 5497.405585]  [<ffffffff811b8092>] do_dentry_open.isra.13+0x1f2/0x320
> [ 5497.405585]  [<ffffffff81336f52>] ? nfs_permission+0x62/0x1d0
> [ 5497.405585]  [<ffffffff811b82ad>] finish_open+0x1d/0x30
> [ 5497.405585]  [<ffffffff811c91be>] do_last.isra.63+0x62e/0xca0
> [ 5497.405585]  [<ffffffff811c5703>] ? inode_permission+0x13/0x50
> [ 5497.405585]  [<ffffffff811c5bde>] ? link_path_walk+0x23e/0x850
> [ 5497.405585]  [<ffffffff811c98e9>] path_openat+0xb9/0x670
> [ 5497.405585]  [<ffffffff811b366b>] ? poison_obj+0x2b/0x40
> [ 5497.405585]  [<ffffffff811ca26e>] do_filp_open+0x3e/0xa0
> [ 5497.405585]  [<ffffffff811d7da1>] ? __alloc_fd+0xd1/0x120
> [ 5497.405585]  [<ffffffff811b95cc>] do_sys_open+0x13c/0x230
> [ 5497.405585]  [<ffffffff810f5bbd>] ? trace_hardirqs_on_caller+0x10d/0x1d0
> [ 5497.405585]  [<ffffffff811b96dd>] SyS_open+0x1d/0x20
> [ 5497.405585]  [<ffffffff81d9f5a9>] system_call_fastpath+0x16/0x1b
> 
> The BUG() is due to the fact that the d_alias hlist is not empty when
> we called into d_instantiate. This is likely due to a situation where
> we did a successful open and instantiated the dentry and then later
> failed and ended up retrying. At that point, we try the open again and
> get back -ENOENT, and try to d_drop/d_add it.
> 
> The problem is that d_drop'ing a positive dentry is not sufficient to
> "clear" it for adding it back into the cache. That just makes it unfindable
> in the hash tables, but doesn't unhitch it from the inode.
> 
> Switch to using the helper we already have for turning positive dentries
> into negative ones.
> 
> Cc: Neil Brown <neilb@suse.de>
> Reported-by: Christoph Hellwig <hch@infradead.org>
> Signed-off-by: Jeff Layton <jlayton@primarydata.com>
> ---
>  fs/nfs/dir.c      | 3 ++-
>  fs/nfs/internal.h | 1 +
>  fs/nfs/nfs4proc.c | 3 +--
>  3 files changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> index 36d921f0c602..3938dba859c5 100644
> --- a/fs/nfs/dir.c
> +++ b/fs/nfs/dir.c
> @@ -1754,11 +1754,12 @@ out_err:
>  }
>  EXPORT_SYMBOL_GPL(nfs_mkdir);
>  
> -static void nfs_dentry_handle_enoent(struct dentry *dentry)
> +void nfs_dentry_handle_enoent(struct dentry *dentry)
>  {
>  	if (dentry->d_inode != NULL && !d_unhashed(dentry))
>  		d_delete(dentry);
>  }
> +EXPORT_SYMBOL_GPL(nfs_dentry_handle_enoent);
>  
>  int nfs_rmdir(struct inode *dir, struct dentry *dentry)
>  {
> diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
> index 9056622d2230..8d85a57ae499 100644
> --- a/fs/nfs/internal.h
> +++ b/fs/nfs/internal.h
> @@ -326,6 +326,7 @@ extern unsigned long nfs_access_cache_scan(struct shrinker *shrink,
>  struct dentry *nfs_lookup(struct inode *, struct dentry *, unsigned int);
>  int nfs_create(struct inode *, struct dentry *, umode_t, bool);
>  int nfs_mkdir(struct inode *, struct dentry *, umode_t);
> +void nfs_dentry_handle_enoent(struct dentry *);
>  int nfs_rmdir(struct inode *, struct dentry *);
>  int nfs_unlink(struct inode *, struct dentry *);
>  int nfs_symlink(struct inode *, struct dentry *, const char *);
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 7dd8aca31c29..84ee3fb9e410 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -2226,8 +2226,7 @@ static int _nfs4_open_and_get_state(struct nfs4_opendata *opendata,
>  	ret = _nfs4_proc_open(opendata);
>  	if (ret != 0) {
>  		if (ret == -ENOENT) {
> -			d_drop(opendata->dentry);
> -			d_add(opendata->dentry, NULL);
> +			nfs_dentry_handle_enoent(opendata->dentry);
>  			nfs_set_verifier(opendata->dentry,
>  					 nfs_save_change_attribute(opendata->dir->d_inode));
>  		}


The d_drop();d_add(); pattern is used a number of times in NFS, but what I
didn't notice before is that it is only used if ->d_inode is NULL.

We *could* simply change the condition to
   if (ret == -ENOENT && opendata->dentry->d_inode == NULL) {

Calling nfs_dentry_handle_enoent() is possibly a good idea, but we do want to
still call d_add().  The whole point of the patch which introduce that code
was to ensure that negative entries were cached.
nfs_dentry_handle_enoent()  is always enough to prepare for d_add(), so we
cannot just replace d_drop with that.

Maybe:

  if (ret == -ENOENT) {
	struct dentry *de = opendata->dentry;
	d_drop(de->d_inode);
        if (de->d_inode == NULL)
		d_add(de, NULL);
        nfs_set_verifier(de, .....);
  }

??

Thanks,
NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: kernel BUG in fs/dcache.c running nfs
  2014-09-10  3:57           ` NeilBrown
@ 2014-09-10 11:59             ` Jeff Layton
  2014-09-11  6:20               ` NeilBrown
  2014-09-26 23:08             ` Al Viro
  1 sibling, 1 reply; 13+ messages in thread
From: Jeff Layton @ 2014-09-10 11:59 UTC (permalink / raw)
  To: NeilBrown; +Cc: Christoph Hellwig, linux-nfs

[-- Attachment #1: Type: text/plain, Size: 10402 bytes --]

On Wed, 10 Sep 2014 13:57:39 +1000
NeilBrown <neilb@suse.de> wrote:

> On Tue, 9 Sep 2014 13:50:06 -0400 Jeff Layton <jeff.layton@primarydata.com>
> wrote:
> 
> > On Tue, 9 Sep 2014 12:15:46 -0400
> > Jeff Layton <jeff.layton@primarydata.com> wrote:
> > 
> > > On Tue, 9 Sep 2014 12:12:44 -0400
> > > Jeff Layton <jlayton@primarydata.com> wrote:
> > > 
> > > > On Tue, 9 Sep 2014 08:42:11 -0700
> > > > Christoph Hellwig <hch@infradead.org> wrote:
> > > > 
> > > > > On Tue, Sep 09, 2014 at 10:59:18AM -0400, Jeff Layton wrote:
> > > > > > > [ 5497.405585]  [<ffffffff8135d0c1>] nfs4_do_open.constprop.84+0x681/0x960
> > > > > > > [ 5497.405585]  [<ffffffff8135d459>] nfs4_atomic_open+0x9/0x20
> > > > > > > [ 5497.405585]  [<ffffffff8136cc3d>] nfs4_file_open+0xcd/0x1b0
> > > > > > > [ 5497.405585]  [<ffffffff811b8092>] do_dentry_open.isra.13+0x1f2/0x320
> > > > > > > [ 5497.405585]  [<ffffffff811b82ad>] finish_open+0x1d/0x30
> > > > > > > [ 5497.405585]  [<ffffffff811c98e9>] path_openat+0xb9/0x670
> > > > > > > [ 5497.405585]  [<ffffffff811ca26e>] do_filp_open+0x3e/0xa0
> > > > > > > [ 5497.405585]  [<ffffffff811b95cc>] do_sys_open+0x13c/0x230
> > > > > > > [ 5497.405585]  [<ffffffff811b96dd>] SyS_open+0x1d/0x20
> > > > > > > [ 5497.405585]  [<ffffffff81d9f5a9>] system_call_fastpath+0x16/0x1b
> > > > > > > 
> > > > > > 
> > > > > > Odd...
> > > > > > 
> > > > > > It looks like you hit the BUG() in d_instantiate.
> > > > > > 
> > > > > > I don't see any calls to d_instantiate in the current nfs_code (aside
> > > > > > from the one in nfs_lookup, and I don't think that's getting called
> > > > > > here). d_instantiate is called by d_add though, and that gets called
> > > > > > from nfs_atomic_open. Is that what happened here?
> > > > > > 
> > > > > > Maybe you can use gdb to figure out what line of code
> > > > > > "nfs4_do_open.constprop.84+0x681" actually is?
> > > > > 
> > > > > My gdb can't cope with these constprop expressions unfortunately.
> > > > > 
> > > > > But when you remove the questionable symbols as I've done above it's
> > > > > pretty clear that this must be the
> > > > > 
> > > > > nfs4_atomic_open
> > > > >  -> nfs4_do_open
> > > > >    -> _nfs4_do_open
> > > > >      -> _nfs4_open_and_get_state
> > > > >        -> d_add
> > > > >          -> d_instantiate
> > > > > 
> > > > > call chain. There is heavy inlining going on inside nfs4file.c, and
> > > > > d_add now is a simple inline around d_instantiate and d_rehash.
> > > > 
> > > > Ok. So I'm guessing that means that the current scheme of doing a
> > > > d_drop/d_add is not valid. d_drop doesn't remove the dentry from the
> > > > i_alias list.
> > > > 
> > > > It looks like the first call there should just be doing a d_delete
> > > > instead, since it's trying to turn the thing into a negative dentry.
> > > 
> > > (cc'ing Neil B.)
> > > 
> > > ...and I'd hazard a guess that 4fa2c54b5198d might be the culprit. You
> > > might want to try backing that out and see if it's still reproducible.
> > > 
> > > Neil, any thoughts?
> > > 
> > 
> > In fact, maybe this patch would make sense?
> > 
> > ------------------[snip]-------------------
> > 
> > [PATCH] nfs: d_drop/d_add of positive dentry may be harmful
> > 
> > Christoph reported the following oops, when running xfstests:
> > 
> > generic/089 199s ...
> > [ 5497.402293] ------------[ cut here ]------------
> > [ 5497.403150] kernel BUG at ../fs/dcache.c:1620!
> > [ 5497.403974] invalid opcode: 0000 [#1] SMP
> > [ 5497.404826] Modules linked in:
> > [ 5497.405280] CPU: 1 PID: 14691 Comm: t_mtab Not tainted 3.17.0-rc3+ #264
> > [ 5497.405585] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
> > [ 5497.405585] task: ffff88007ac801d0 ti: ffff8800670a4000 task.ti: ffff8800670a4000
> > [ 5497.405585] RIP: 0010:[<ffffffff811d1345>]  [<ffffffff811d1345>] d_instantiate+0x75/0x80
> > [ 5497.405585] RSP: 0018:ffff8800670a7a68  EFLAGS: 00010286
> > [ 5497.405585] RAX: 0000000000000001 RBX: ffff880066c399d8 RCX: ffff88007ac80990
> > [ 5497.405585] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880066c399d8
> > [ 5497.405585] RBP: ffff8800670a7a88 R08: 0000000000000001 R09: 0000000000000000
> > [ 5497.405585] R10: ffff880072f45000 R11: 000000000003fdf0 R12: ffff880066c399d8
> > [ 5497.405585] R13: ffff88007a684800 R14: ffff88007acbc280 R15: ffff8800670de000
> > [ 5497.405585] FS:  00007f6db6aae700(0000) GS:ffff88007fc80000(0000) knlGS:0000000000000000
> > [ 5497.405585] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > [ 5497.405585] CR2: 00007f6db5f56800 CR3: 000000007ac9e000 CR4: 00000000000006e0
> > [ 5497.405585] Stack:
> > [ 5497.405585]  ffff8800670a7a88 ffff880066c399d8 ffff88007a684800 ffff88007a684800
> > [ 5497.405585]  ffff8800670a7b68 ffffffff8135d0c1 ffffffff00000004 ffff8800000000d0
> > [ 5497.405585]  ffff88007d400180 0000000000000246 ffff8800fffffffe ffff880072f45000
> > [ 5497.405585] Call Trace:
> > [ 5497.405585]  [<ffffffff8135d0c1>] nfs4_do_open.constprop.84+0x681/0x960
> > [ 5497.405585]  [<ffffffff8135d459>] nfs4_atomic_open+0x9/0x20
> > [ 5497.405585]  [<ffffffff8136cc3d>] nfs4_file_open+0xcd/0x1b0
> > [ 5497.405585]  [<ffffffff81d9ee56>] ? _raw_spin_unlock+0x26/0x30
> > [ 5497.405585]  [<ffffffff8176b73d>] ? lockref_get+0x1d/0x30
> > [ 5497.405585]  [<ffffffff8136cb70>] ? nfs4_file_fsync+0xb0/0xb0
> > [ 5497.405585]  [<ffffffff811b8092>] do_dentry_open.isra.13+0x1f2/0x320
> > [ 5497.405585]  [<ffffffff81336f52>] ? nfs_permission+0x62/0x1d0
> > [ 5497.405585]  [<ffffffff811b82ad>] finish_open+0x1d/0x30
> > [ 5497.405585]  [<ffffffff811c91be>] do_last.isra.63+0x62e/0xca0
> > [ 5497.405585]  [<ffffffff811c5703>] ? inode_permission+0x13/0x50
> > [ 5497.405585]  [<ffffffff811c5bde>] ? link_path_walk+0x23e/0x850
> > [ 5497.405585]  [<ffffffff811c98e9>] path_openat+0xb9/0x670
> > [ 5497.405585]  [<ffffffff811b366b>] ? poison_obj+0x2b/0x40
> > [ 5497.405585]  [<ffffffff811ca26e>] do_filp_open+0x3e/0xa0
> > [ 5497.405585]  [<ffffffff811d7da1>] ? __alloc_fd+0xd1/0x120
> > [ 5497.405585]  [<ffffffff811b95cc>] do_sys_open+0x13c/0x230
> > [ 5497.405585]  [<ffffffff810f5bbd>] ? trace_hardirqs_on_caller+0x10d/0x1d0
> > [ 5497.405585]  [<ffffffff811b96dd>] SyS_open+0x1d/0x20
> > [ 5497.405585]  [<ffffffff81d9f5a9>] system_call_fastpath+0x16/0x1b
> > 
> > The BUG() is due to the fact that the d_alias hlist is not empty when
> > we called into d_instantiate. This is likely due to a situation where
> > we did a successful open and instantiated the dentry and then later
> > failed and ended up retrying. At that point, we try the open again and
> > get back -ENOENT, and try to d_drop/d_add it.
> > 
> > The problem is that d_drop'ing a positive dentry is not sufficient to
> > "clear" it for adding it back into the cache. That just makes it unfindable
> > in the hash tables, but doesn't unhitch it from the inode.
> > 
> > Switch to using the helper we already have for turning positive dentries
> > into negative ones.
> > 
> > Cc: Neil Brown <neilb@suse.de>
> > Reported-by: Christoph Hellwig <hch@infradead.org>
> > Signed-off-by: Jeff Layton <jlayton@primarydata.com>
> > ---
> >  fs/nfs/dir.c      | 3 ++-
> >  fs/nfs/internal.h | 1 +
> >  fs/nfs/nfs4proc.c | 3 +--
> >  3 files changed, 4 insertions(+), 3 deletions(-)
> > 
> > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> > index 36d921f0c602..3938dba859c5 100644
> > --- a/fs/nfs/dir.c
> > +++ b/fs/nfs/dir.c
> > @@ -1754,11 +1754,12 @@ out_err:
> >  }
> >  EXPORT_SYMBOL_GPL(nfs_mkdir);
> >  
> > -static void nfs_dentry_handle_enoent(struct dentry *dentry)
> > +void nfs_dentry_handle_enoent(struct dentry *dentry)
> >  {
> >  	if (dentry->d_inode != NULL && !d_unhashed(dentry))
> >  		d_delete(dentry);
> >  }
> > +EXPORT_SYMBOL_GPL(nfs_dentry_handle_enoent);
> >  
> >  int nfs_rmdir(struct inode *dir, struct dentry *dentry)
> >  {
> > diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
> > index 9056622d2230..8d85a57ae499 100644
> > --- a/fs/nfs/internal.h
> > +++ b/fs/nfs/internal.h
> > @@ -326,6 +326,7 @@ extern unsigned long nfs_access_cache_scan(struct shrinker *shrink,
> >  struct dentry *nfs_lookup(struct inode *, struct dentry *, unsigned int);
> >  int nfs_create(struct inode *, struct dentry *, umode_t, bool);
> >  int nfs_mkdir(struct inode *, struct dentry *, umode_t);
> > +void nfs_dentry_handle_enoent(struct dentry *);
> >  int nfs_rmdir(struct inode *, struct dentry *);
> >  int nfs_unlink(struct inode *, struct dentry *);
> >  int nfs_symlink(struct inode *, struct dentry *, const char *);
> > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> > index 7dd8aca31c29..84ee3fb9e410 100644
> > --- a/fs/nfs/nfs4proc.c
> > +++ b/fs/nfs/nfs4proc.c
> > @@ -2226,8 +2226,7 @@ static int _nfs4_open_and_get_state(struct nfs4_opendata *opendata,
> >  	ret = _nfs4_proc_open(opendata);
> >  	if (ret != 0) {
> >  		if (ret == -ENOENT) {
> > -			d_drop(opendata->dentry);
> > -			d_add(opendata->dentry, NULL);
> > +			nfs_dentry_handle_enoent(opendata->dentry);
> >  			nfs_set_verifier(opendata->dentry,
> >  					 nfs_save_change_attribute(opendata->dir->d_inode));
> >  		}
> 
> 
> The d_drop();d_add(); pattern is used a number of times in NFS, but what I
> didn't notice before is that it is only used if ->d_inode is NULL.
> 
> We *could* simply change the condition to
>    if (ret == -ENOENT && opendata->dentry->d_inode == NULL) {
> 
> Calling nfs_dentry_handle_enoent() is possibly a good idea, but we do want to
> still call d_add().  The whole point of the patch which introduce that code
> was to ensure that negative entries were cached.
> nfs_dentry_handle_enoent()  is always enough to prepare for d_add(), so we
> cannot just replace d_drop with that.
> 
> Maybe:
> 
>   if (ret == -ENOENT) {
> 	struct dentry *de = opendata->dentry;
> 	d_drop(de->d_inode);
>         if (de->d_inode == NULL)
> 		d_add(de, NULL);
>         nfs_set_verifier(de, .....);
>   }
> 

Yeah using nfs_dentry_handle_enoent won't hash a brand new negative
dentry. The above code though won't convert an existing positive dentry
into a negative one. I guess though that's a rare enough situation that
it's not worth optimizing for?

-- 
Jeff Layton <jlayton@primarydata.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: kernel BUG in fs/dcache.c running nfs
  2014-09-10 11:59             ` Jeff Layton
@ 2014-09-11  6:20               ` NeilBrown
  0 siblings, 0 replies; 13+ messages in thread
From: NeilBrown @ 2014-09-11  6:20 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Christoph Hellwig, linux-nfs

[-- Attachment #1: Type: text/plain, Size: 11008 bytes --]

On Wed, 10 Sep 2014 07:59:43 -0400 Jeff Layton <jeff.layton@primarydata.com>
wrote:

> On Wed, 10 Sep 2014 13:57:39 +1000
> NeilBrown <neilb@suse.de> wrote:
> 
> > On Tue, 9 Sep 2014 13:50:06 -0400 Jeff Layton <jeff.layton@primarydata.com>
> > wrote:
> > 
> > > On Tue, 9 Sep 2014 12:15:46 -0400
> > > Jeff Layton <jeff.layton@primarydata.com> wrote:
> > > 
> > > > On Tue, 9 Sep 2014 12:12:44 -0400
> > > > Jeff Layton <jlayton@primarydata.com> wrote:
> > > > 
> > > > > On Tue, 9 Sep 2014 08:42:11 -0700
> > > > > Christoph Hellwig <hch@infradead.org> wrote:
> > > > > 
> > > > > > On Tue, Sep 09, 2014 at 10:59:18AM -0400, Jeff Layton wrote:
> > > > > > > > [ 5497.405585]  [<ffffffff8135d0c1>] nfs4_do_open.constprop.84+0x681/0x960
> > > > > > > > [ 5497.405585]  [<ffffffff8135d459>] nfs4_atomic_open+0x9/0x20
> > > > > > > > [ 5497.405585]  [<ffffffff8136cc3d>] nfs4_file_open+0xcd/0x1b0
> > > > > > > > [ 5497.405585]  [<ffffffff811b8092>] do_dentry_open.isra.13+0x1f2/0x320
> > > > > > > > [ 5497.405585]  [<ffffffff811b82ad>] finish_open+0x1d/0x30
> > > > > > > > [ 5497.405585]  [<ffffffff811c98e9>] path_openat+0xb9/0x670
> > > > > > > > [ 5497.405585]  [<ffffffff811ca26e>] do_filp_open+0x3e/0xa0
> > > > > > > > [ 5497.405585]  [<ffffffff811b95cc>] do_sys_open+0x13c/0x230
> > > > > > > > [ 5497.405585]  [<ffffffff811b96dd>] SyS_open+0x1d/0x20
> > > > > > > > [ 5497.405585]  [<ffffffff81d9f5a9>] system_call_fastpath+0x16/0x1b
> > > > > > > > 
> > > > > > > 
> > > > > > > Odd...
> > > > > > > 
> > > > > > > It looks like you hit the BUG() in d_instantiate.
> > > > > > > 
> > > > > > > I don't see any calls to d_instantiate in the current nfs_code (aside
> > > > > > > from the one in nfs_lookup, and I don't think that's getting called
> > > > > > > here). d_instantiate is called by d_add though, and that gets called
> > > > > > > from nfs_atomic_open. Is that what happened here?
> > > > > > > 
> > > > > > > Maybe you can use gdb to figure out what line of code
> > > > > > > "nfs4_do_open.constprop.84+0x681" actually is?
> > > > > > 
> > > > > > My gdb can't cope with these constprop expressions unfortunately.
> > > > > > 
> > > > > > But when you remove the questionable symbols as I've done above it's
> > > > > > pretty clear that this must be the
> > > > > > 
> > > > > > nfs4_atomic_open
> > > > > >  -> nfs4_do_open
> > > > > >    -> _nfs4_do_open
> > > > > >      -> _nfs4_open_and_get_state
> > > > > >        -> d_add
> > > > > >          -> d_instantiate
> > > > > > 
> > > > > > call chain. There is heavy inlining going on inside nfs4file.c, and
> > > > > > d_add now is a simple inline around d_instantiate and d_rehash.
> > > > > 
> > > > > Ok. So I'm guessing that means that the current scheme of doing a
> > > > > d_drop/d_add is not valid. d_drop doesn't remove the dentry from the
> > > > > i_alias list.
> > > > > 
> > > > > It looks like the first call there should just be doing a d_delete
> > > > > instead, since it's trying to turn the thing into a negative dentry.
> > > > 
> > > > (cc'ing Neil B.)
> > > > 
> > > > ...and I'd hazard a guess that 4fa2c54b5198d might be the culprit. You
> > > > might want to try backing that out and see if it's still reproducible.
> > > > 
> > > > Neil, any thoughts?
> > > > 
> > > 
> > > In fact, maybe this patch would make sense?
> > > 
> > > ------------------[snip]-------------------
> > > 
> > > [PATCH] nfs: d_drop/d_add of positive dentry may be harmful
> > > 
> > > Christoph reported the following oops, when running xfstests:
> > > 
> > > generic/089 199s ...
> > > [ 5497.402293] ------------[ cut here ]------------
> > > [ 5497.403150] kernel BUG at ../fs/dcache.c:1620!
> > > [ 5497.403974] invalid opcode: 0000 [#1] SMP
> > > [ 5497.404826] Modules linked in:
> > > [ 5497.405280] CPU: 1 PID: 14691 Comm: t_mtab Not tainted 3.17.0-rc3+ #264
> > > [ 5497.405585] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
> > > [ 5497.405585] task: ffff88007ac801d0 ti: ffff8800670a4000 task.ti: ffff8800670a4000
> > > [ 5497.405585] RIP: 0010:[<ffffffff811d1345>]  [<ffffffff811d1345>] d_instantiate+0x75/0x80
> > > [ 5497.405585] RSP: 0018:ffff8800670a7a68  EFLAGS: 00010286
> > > [ 5497.405585] RAX: 0000000000000001 RBX: ffff880066c399d8 RCX: ffff88007ac80990
> > > [ 5497.405585] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880066c399d8
> > > [ 5497.405585] RBP: ffff8800670a7a88 R08: 0000000000000001 R09: 0000000000000000
> > > [ 5497.405585] R10: ffff880072f45000 R11: 000000000003fdf0 R12: ffff880066c399d8
> > > [ 5497.405585] R13: ffff88007a684800 R14: ffff88007acbc280 R15: ffff8800670de000
> > > [ 5497.405585] FS:  00007f6db6aae700(0000) GS:ffff88007fc80000(0000) knlGS:0000000000000000
> > > [ 5497.405585] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > > [ 5497.405585] CR2: 00007f6db5f56800 CR3: 000000007ac9e000 CR4: 00000000000006e0
> > > [ 5497.405585] Stack:
> > > [ 5497.405585]  ffff8800670a7a88 ffff880066c399d8 ffff88007a684800 ffff88007a684800
> > > [ 5497.405585]  ffff8800670a7b68 ffffffff8135d0c1 ffffffff00000004 ffff8800000000d0
> > > [ 5497.405585]  ffff88007d400180 0000000000000246 ffff8800fffffffe ffff880072f45000
> > > [ 5497.405585] Call Trace:
> > > [ 5497.405585]  [<ffffffff8135d0c1>] nfs4_do_open.constprop.84+0x681/0x960
> > > [ 5497.405585]  [<ffffffff8135d459>] nfs4_atomic_open+0x9/0x20
> > > [ 5497.405585]  [<ffffffff8136cc3d>] nfs4_file_open+0xcd/0x1b0
> > > [ 5497.405585]  [<ffffffff81d9ee56>] ? _raw_spin_unlock+0x26/0x30
> > > [ 5497.405585]  [<ffffffff8176b73d>] ? lockref_get+0x1d/0x30
> > > [ 5497.405585]  [<ffffffff8136cb70>] ? nfs4_file_fsync+0xb0/0xb0
> > > [ 5497.405585]  [<ffffffff811b8092>] do_dentry_open.isra.13+0x1f2/0x320
> > > [ 5497.405585]  [<ffffffff81336f52>] ? nfs_permission+0x62/0x1d0
> > > [ 5497.405585]  [<ffffffff811b82ad>] finish_open+0x1d/0x30
> > > [ 5497.405585]  [<ffffffff811c91be>] do_last.isra.63+0x62e/0xca0
> > > [ 5497.405585]  [<ffffffff811c5703>] ? inode_permission+0x13/0x50
> > > [ 5497.405585]  [<ffffffff811c5bde>] ? link_path_walk+0x23e/0x850
> > > [ 5497.405585]  [<ffffffff811c98e9>] path_openat+0xb9/0x670
> > > [ 5497.405585]  [<ffffffff811b366b>] ? poison_obj+0x2b/0x40
> > > [ 5497.405585]  [<ffffffff811ca26e>] do_filp_open+0x3e/0xa0
> > > [ 5497.405585]  [<ffffffff811d7da1>] ? __alloc_fd+0xd1/0x120
> > > [ 5497.405585]  [<ffffffff811b95cc>] do_sys_open+0x13c/0x230
> > > [ 5497.405585]  [<ffffffff810f5bbd>] ? trace_hardirqs_on_caller+0x10d/0x1d0
> > > [ 5497.405585]  [<ffffffff811b96dd>] SyS_open+0x1d/0x20
> > > [ 5497.405585]  [<ffffffff81d9f5a9>] system_call_fastpath+0x16/0x1b
> > > 
> > > The BUG() is due to the fact that the d_alias hlist is not empty when
> > > we called into d_instantiate. This is likely due to a situation where
> > > we did a successful open and instantiated the dentry and then later
> > > failed and ended up retrying. At that point, we try the open again and
> > > get back -ENOENT, and try to d_drop/d_add it.
> > > 
> > > The problem is that d_drop'ing a positive dentry is not sufficient to
> > > "clear" it for adding it back into the cache. That just makes it unfindable
> > > in the hash tables, but doesn't unhitch it from the inode.
> > > 
> > > Switch to using the helper we already have for turning positive dentries
> > > into negative ones.
> > > 
> > > Cc: Neil Brown <neilb@suse.de>
> > > Reported-by: Christoph Hellwig <hch@infradead.org>
> > > Signed-off-by: Jeff Layton <jlayton@primarydata.com>
> > > ---
> > >  fs/nfs/dir.c      | 3 ++-
> > >  fs/nfs/internal.h | 1 +
> > >  fs/nfs/nfs4proc.c | 3 +--
> > >  3 files changed, 4 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> > > index 36d921f0c602..3938dba859c5 100644
> > > --- a/fs/nfs/dir.c
> > > +++ b/fs/nfs/dir.c
> > > @@ -1754,11 +1754,12 @@ out_err:
> > >  }
> > >  EXPORT_SYMBOL_GPL(nfs_mkdir);
> > >  
> > > -static void nfs_dentry_handle_enoent(struct dentry *dentry)
> > > +void nfs_dentry_handle_enoent(struct dentry *dentry)
> > >  {
> > >  	if (dentry->d_inode != NULL && !d_unhashed(dentry))
> > >  		d_delete(dentry);
> > >  }
> > > +EXPORT_SYMBOL_GPL(nfs_dentry_handle_enoent);
> > >  
> > >  int nfs_rmdir(struct inode *dir, struct dentry *dentry)
> > >  {
> > > diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
> > > index 9056622d2230..8d85a57ae499 100644
> > > --- a/fs/nfs/internal.h
> > > +++ b/fs/nfs/internal.h
> > > @@ -326,6 +326,7 @@ extern unsigned long nfs_access_cache_scan(struct shrinker *shrink,
> > >  struct dentry *nfs_lookup(struct inode *, struct dentry *, unsigned int);
> > >  int nfs_create(struct inode *, struct dentry *, umode_t, bool);
> > >  int nfs_mkdir(struct inode *, struct dentry *, umode_t);
> > > +void nfs_dentry_handle_enoent(struct dentry *);
> > >  int nfs_rmdir(struct inode *, struct dentry *);
> > >  int nfs_unlink(struct inode *, struct dentry *);
> > >  int nfs_symlink(struct inode *, struct dentry *, const char *);
> > > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> > > index 7dd8aca31c29..84ee3fb9e410 100644
> > > --- a/fs/nfs/nfs4proc.c
> > > +++ b/fs/nfs/nfs4proc.c
> > > @@ -2226,8 +2226,7 @@ static int _nfs4_open_and_get_state(struct nfs4_opendata *opendata,
> > >  	ret = _nfs4_proc_open(opendata);
> > >  	if (ret != 0) {
> > >  		if (ret == -ENOENT) {
> > > -			d_drop(opendata->dentry);
> > > -			d_add(opendata->dentry, NULL);
> > > +			nfs_dentry_handle_enoent(opendata->dentry);
> > >  			nfs_set_verifier(opendata->dentry,
> > >  					 nfs_save_change_attribute(opendata->dir->d_inode));
> > >  		}
> > 
> > 
> > The d_drop();d_add(); pattern is used a number of times in NFS, but what I
> > didn't notice before is that it is only used if ->d_inode is NULL.
> > 
> > We *could* simply change the condition to
> >    if (ret == -ENOENT && opendata->dentry->d_inode == NULL) {
> > 
> > Calling nfs_dentry_handle_enoent() is possibly a good idea, but we do want to
> > still call d_add().  The whole point of the patch which introduce that code
> > was to ensure that negative entries were cached.
> > nfs_dentry_handle_enoent()  is always enough to prepare for d_add(), so we
> > cannot just replace d_drop with that.
> > 
> > Maybe:
> > 
> >   if (ret == -ENOENT) {
> > 	struct dentry *de = opendata->dentry;
> > 	d_drop(de->d_inode);
> >         if (de->d_inode == NULL)
> > 		d_add(de, NULL);
> >         nfs_set_verifier(de, .....);
> >   }
> > 
> 
> Yeah using nfs_dentry_handle_enoent won't hash a brand new negative
> dentry. The above code though won't convert an existing positive dentry
> into a negative one. I guess though that's a rare enough situation that
> it's not worth optimizing for?
> 

I agree.  I've just posted a patch which I think is the best balance.

Thanks for your analysis and review!

NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: kernel BUG in fs/dcache.c running nfs
  2014-09-10  3:57           ` NeilBrown
  2014-09-10 11:59             ` Jeff Layton
@ 2014-09-26 23:08             ` Al Viro
  2014-09-29  1:35               ` NeilBrown
  1 sibling, 1 reply; 13+ messages in thread
From: Al Viro @ 2014-09-26 23:08 UTC (permalink / raw)
  To: NeilBrown; +Cc: Jeff Layton, Christoph Hellwig, linux-nfs

On Wed, Sep 10, 2014 at 01:57:39PM +1000, NeilBrown wrote:

> The d_drop();d_add(); pattern is used a number of times in NFS, but what I
> didn't notice before is that it is only used if ->d_inode is NULL.

And the point of that "pattern" would be...?  If dentry is already negative
and hashed, it does nothing whatsoever.  If it is negative and unhashed,
d_drop() is a no-op, obviously.  And if it's positive, we get a nice
shiny oops.

Where are we doing anything of that sort, anyway?  I see a bloody odd
instance in nfs_atomic_open() and that's it.  Note that nfs_instantiate()
is different (and not necessary nice - that d_drop() in case when this
sucker is called from mkdir() is asking for races with mount()).  How
do we get a positive hashed dentry there, anyway?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: kernel BUG in fs/dcache.c running nfs
  2014-09-26 23:08             ` Al Viro
@ 2014-09-29  1:35               ` NeilBrown
  0 siblings, 0 replies; 13+ messages in thread
From: NeilBrown @ 2014-09-29  1:35 UTC (permalink / raw)
  To: Al Viro; +Cc: Jeff Layton, Christoph Hellwig, linux-nfs

[-- Attachment #1: Type: text/plain, Size: 1383 bytes --]

On Sat, 27 Sep 2014 00:08:01 +0100 Al Viro <viro@ZenIV.linux.org.uk> wrote:

> On Wed, Sep 10, 2014 at 01:57:39PM +1000, NeilBrown wrote:
> 
> > The d_drop();d_add(); pattern is used a number of times in NFS, but what I
> > didn't notice before is that it is only used if ->d_inode is NULL.
> 
> And the point of that "pattern" would be...?  If dentry is already negative
> and hashed, it does nothing whatsoever.  If it is negative and unhashed,
> d_drop() is a no-op, obviously.  And if it's positive, we get a nice
> shiny oops.
> 
> Where are we doing anything of that sort, anyway?  I see a bloody odd
> instance in nfs_atomic_open() and that's it.  Note that nfs_instantiate()
> is different (and not necessary nice - that d_drop() in case when this
> sucker is called from mkdir() is asking for races with mount()).  How
> do we get a positive hashed dentry there, anyway?

Apart from nfs_atomic_open() and  nfs_instantiate(), the main one is
in _nfs4_open_and_get_state:
		/* FIXME: Is this d_drop() ever needed? */
		d_drop(dentry);
		dentry = d_add_unique(dentry, igrab(state->inode));
   maybe you can answer the question in the comment

nfs_link also has

	d_drop(dentry);
	error = NFS_PROTO(dir)->link(inode, dir, &dentry->d_name);
	if (error == 0) {
		ihold(inode);
		d_add(dentry, inode);
	}

which is similar.

Thanks,
NeilBrown

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2014-09-29  1:35 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-08 14:45 kernel BUG in fs/dcache.c running nfs Christoph Hellwig
2014-09-09 14:59 ` Jeff Layton
2014-09-09 15:42   ` Christoph Hellwig
2014-09-09 16:12     ` Jeff Layton
2014-09-09 16:15       ` Jeff Layton
2014-09-09 17:50         ` Jeff Layton
2014-09-10  3:57           ` NeilBrown
2014-09-10 11:59             ` Jeff Layton
2014-09-11  6:20               ` NeilBrown
2014-09-26 23:08             ` Al Viro
2014-09-29  1:35               ` NeilBrown
2014-09-09 16:59   ` Trond Myklebust
2014-09-09 17:28     ` Jeff Layton

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.