* Thread overran stack, or stack corrupted BUG on mount
@ 2013-11-12 15:31 Weston Andros Adamson
  2013-11-12 15:55 ` Jeff Layton
  0 siblings, 1 reply; 10+ messages in thread
From: Weston Andros Adamson @ 2013-11-12 15:31 UTC (permalink / raw)
  To: linux-nfs list

I got this oops yesterday running the “test_sec_options.sh” script I recently posted as a patch to Anna’s nfs-ordeal repo (tons of mount/umount).

At this point GSSD had died (I was tracking down an fd leak). I haven’t been able to reproduce this yet.

Any idea if I should trust the stack trace? Could this be related to the issue Jeff just posted?

-dros

BUG: unable to handle kernel paging request at ffff88017a604030
IP: [<ffffffff81063089>] __wake_up+0x22/0x4d
PGD 2651067 PUD 0 
Thread overran stack, or stack corrupted
Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
Modules linked in: nfsv4 cts rpcsec_gss_krb5 nfsv3 nfs fscache crc32c_intel aesni_intel aes_x86_64 glue_helper lrw gf128mul ppdev ablk_helper cryptd serio_raw i2c_piix4 i2c_core e1000 nfsd parport_pc parport shpchp auth_rpcgss oid_registry exportfs nfs_acl lockd floppy freq_table sunrpc autofs4 mptspi scsi_transport_spi mptscsih mptbase ata_generic
CPU: 0 PID: 10547 Comm: mount.nfs Not tainted 3.12.0-rc3-branch-dros_testing+ #1
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013
task: ffff8800798f2100 ti: ffff88007a604000 task.ti: ffff88007a604000
RIP: 0010:[<ffffffff81063089>]  [<ffffffff81063089>] __wake_up+0x22/0x4d
RSP: 0018:ffff88007a604028  EFLAGS: 00010092
RAX: 0000000000000296 RBX: ffffffffa006a980 RCX: 000000009a519a50
RDX: 000000009a509a50 RSI: 000000000000038a RDI: ffffffffa006a980
RBP: ffff88017a604058 R08: 0000000000000003 R09: 0000000000000001
R10: ffff88006d41d7c0 R11: ffff88007f20b000 R12: ffff88007a6058e0
R13: ffff8800645d8018 R14: ffff88007a6058f8 R15: ffff8800798f2100
FS:  00007fb2765b3880(0000) GS:ffff88007f200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff88017a604030 CR3: 000000007a6ba000 CR4: 00000000001407f0
Stack:
 ffff88007a604038 0000000000000000 ffff880000000001 0000000000000003
 ffff88006453a0d0 ffff88007a6058e0 ffff88007a604078 ffffffffa0045de2
 ffff88006453dbe0 ffff88006453dbe0 ffff88007a604098 ffffffffa0045d27
Call Trace:
 [<ffffffffa0045de2>] ? rpc_release_client+0x4a/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
 [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
 [<ffffffffa0045f39>] ? rpc_shutdown_client+0x107/0x116 [sunrpc]
 [<ffffffffa02a6456>] ? __fscache_cookie_put+0x43/0x4f [fscache]
 [<ffffffffa02a65ca>] ? __fscache_relinquish_cookie+0x168/0x16d [fscache]
 [<ffffffffa02bdc2b>] ? nfs_free_client+0x4c/0xaf [nfs]
 [<ffffffffa0340e4a>] ? nfs4_free_client+0x97/0x9b [nfsv4]
 [<ffffffffa02bcfd9>] ? nfs_put_client+0xe8/0xed [nfs]
 [<ffffffffa0341126>] ? nfs4_init_client+0x22e/0x29d [nfsv4]
 [<ffffffffa02bc95f>] ? nfs_probe_fsinfo+0x2c7/0x2c7 [nfs]
 [<ffffffffa02bd1af>] ? nfs_get_client+0x8a/0x2bf [nfs]
 [<ffffffffa02bd37f>] ? nfs_get_client+0x25a/0x2bf [nfs]
 [<ffffffffa034059c>] ? nfs4_set_client+0x9f/0xf1 [nfsv4]
 [<ffffffffa004e917>] ? __rpc_init_priority_wait_queue+0x98/0xcf [sunrpc]
 [<ffffffffa0341999>] ? nfs4_create_server+0xfe/0x264 [nfsv4]
 [<ffffffffa033ac59>] ? nfs4_remote_mount+0x2f/0x57 [nfsv4]
 [<ffffffff8112a846>] ? mount_fs+0x69/0x157
 [<ffffffff810fb79b>] ? __alloc_percpu+0x10/0x12
 [<ffffffff8113fcbd>] ? vfs_kern_mount+0x62/0xd9
 [<ffffffffa033ac02>] ? nfs_do_root_mount+0x8c/0xb4 [nfsv4]
 [<ffffffffa033aea9>] ? nfs4_try_mount+0x60/0xbb [nfsv4]
 [<ffffffffa02c80eb>] ? nfs_fs_mount+0x88f/0x97a [nfs]
 [<ffffffffa02c8620>] ? nfs_clone_super+0x6b/0x6b [nfs]
 [<ffffffffa02c59ce>] ? nfs_set_super+0x53/0x53 [nfs]
 [<ffffffff8112a846>] ? mount_fs+0x69/0x157
 [<ffffffff810fb79b>] ? __alloc_percpu+0x10/0x12
 [<ffffffff8113fcbd>] ? vfs_kern_mount+0x62/0xd9
 [<ffffffff81141fa6>] ? do_mount+0x6ce/0x871
 [<ffffffff81141833>] ? copy_mount_options+0xc2/0x12f
 [<ffffffff811421ce>] ? SyS_mount+0x85/0xbe
 [<ffffffff814a4292>] ? system_call_fastpath+0x16/0x1b
Code: 89 e5 e8 98 ff ff ff 5d c3 0f 1f 44 00 00 55 48 89 e5 41 54 53 48 89 fb 48 83 ec 20 89 55 e0 89 75 e8 48 89 4d d8 e8 0c 99 43 00 <4c> 8b 45 d8 48 89 df 8b 55 e0 49 89 c4 31 c9 8b 75 e8 e8 b4 d0 
RIP  [<ffffffff81063089>] __wake_up+0x22/0x4d
 RSP <ffff88007a604028>
CR2: ffff88017a604030
---[ end trace 1122f3f8cf98e4c2 ]---



* Re: Thread overran stack, or stack corrupted BUG on mount
  2013-11-12 15:31 Thread overran stack, or stack corrupted BUG on mount Weston Andros Adamson
@ 2013-11-12 15:55 ` Jeff Layton
  2013-11-12 16:20   ` Jeff Layton
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff Layton @ 2013-11-12 15:55 UTC (permalink / raw)
  To: Weston Andros Adamson; +Cc: linux-nfs list

On Tue, 12 Nov 2013 15:31:34 +0000
Weston Andros Adamson <dros@netapp.com> wrote:

> I got this oops yesterday running the “test_sec_options.sh” script I recently posted as a patch to Anna’s nfs-ordeal repo (tons of mount/umount).
> 
> At this point GSSD had died (I was tracking down a fd leak).  I haven’t been able to reproduce this yet.
> 
> Any idea if I should trust the stack trace? Could this be related to the issue Jeff just posted?
> 
> -dros
> 
> [full oops report snipped; identical to the one in the original message above]
> 

Yep, I think this is the same problem I reported earlier. I ran the
reproducer with rpc_debug turned up and ended up seeing a very similar
stack trace. Basically, the server is returning NFS4ERR_CLID_INUSE, but
the client keeps retrying the call over and over.

I suspect that leads to some sort of recursion, but I haven't quite
spotted it yet.

-- 
Jeff Layton <jlayton@redhat.com>


* Re: Thread overran stack, or stack corrupted BUG on mount
  2013-11-12 15:55 ` Jeff Layton
@ 2013-11-12 16:20   ` Jeff Layton
  2013-11-12 16:23     ` Chuck Lever
  2013-11-12 16:57     ` J. Bruce Fields
  0 siblings, 2 replies; 10+ messages in thread
From: Jeff Layton @ 2013-11-12 16:20 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Weston Andros Adamson, linux-nfs list, chuck.lever

On Tue, 12 Nov 2013 10:55:39 -0500
Jeff Layton <jlayton@redhat.com> wrote:

> On Tue, 12 Nov 2013 15:31:34 +0000
> Weston Andros Adamson <dros@netapp.com> wrote:
> 
> > I got this oops yesterday running the “test_sec_options.sh” script I recently posted as a patch to Anna’s nfs-ordeal repo (tons of mount/umount).
> > 
> > At this point GSSD had died (I was tracking down a fd leak).  I haven’t been able to reproduce this yet.
> > 
> > Any idea if I should trust the stack trace? Could this be related to the issue Jeff just posted?
> > 
> > -dros
> > 
> 
> Yep, I think this is the same problem I reported earlier. I ran the
> reproducer with rpc_debug turned up and ended up seeing a very similar
> stack trace. Basically, the server is returning NFS4ERR_CLID_INUSE, but
> the client keeps retrying the call over and over.
> 
> I suspect that leads to some sort of recursion, but I haven't quite
> spotted it yet.
> 

(cc'ing Chuck since I think the problem is in the new detect_trunking code)

Ok, I think I see the problem. The looping comes from this block in
nfs4_discover_server_trunking:

-----------------[snip]-----------------
        case -NFS4ERR_CLID_INUSE:
        case -NFS4ERR_WRONGSEC:
                clnt = rpc_clone_client_set_auth(clnt, RPC_AUTH_UNIX);
                if (IS_ERR(clnt)) {
                        status = PTR_ERR(clnt);
                        break;
                }
                /* Note: this is safe because we haven't yet marked the
                 * client as ready, so we are the only user of
                 * clp->cl_rpcclient
                 */
                clnt = xchg(&clp->cl_rpcclient, clnt);
                rpc_shutdown_client(clnt);
                clnt = clp->cl_rpcclient;
                goto again;
-----------------[snip]-----------------

...so in the case of the reproducer, we get back -NFS4ERR_CLID_INUSE.
At that point we call rpc_clone_client_set_auth(), which creates a new
rpc_clnt, but it's created as a child of the original.

When rpc_shutdown_client() is called, the original clnt is not destroyed
because the child still holds a reference to it. So we try the call
again, it fails with the same error over and over, and we end up with a
long chain of rpc_clnts.
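
To make the chain concrete, here is a toy userspace model of the sequence
above. It is only a sketch: struct toy_clnt and the toy_* helpers are
made-up stand-ins, not the sunrpc API, and the only behaviour carried over
is the assumption that a clone takes a reference on its parent.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for struct rpc_clnt (hypothetical, not the kernel type). */
struct toy_clnt {
    int refcount;
    struct toy_clnt *parent;
};

static int live_clients;   /* clients allocated and not yet freed */

/* Create a client; a clone pins its parent with an extra reference. */
static struct toy_clnt *toy_clone(struct toy_clnt *parent)
{
    struct toy_clnt *c = calloc(1, sizeof(*c));

    assert(c != NULL);
    c->refcount = 1;
    c->parent = parent;
    if (parent)
        parent->refcount++;
    live_clients++;
    return c;
}

/* Drop one reference; free only when the count hits zero. Returns 1 if freed. */
static int toy_shutdown(struct toy_clnt *c)
{
    if (--c->refcount > 0)
        return 0;
    free(c);
    live_clients--;
    return 1;
}

/*
 * Model the clone / xchg / shutdown / "goto again" sequence for the given
 * number of retries and report how many clients are still allocated.
 * The chain is intentionally left allocated so it can be counted.
 */
static int toy_live_after(int retries)
{
    struct toy_clnt *cur;

    live_clients = 0;
    cur = toy_clone(NULL);
    for (int i = 0; i < retries; i++) {
        struct toy_clnt *clone = toy_clone(cur);
        int freed = toy_shutdown(cur);   /* the clone still pins the old client */

        assert(!freed);
        cur = clone;
    }
    return live_clients;
}
```

In this model toy_live_after(5) leaves six clients allocated: the original
plus one clone per retry, none of which the shutdown call manages to free.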

How that ends up smashing the stack, I'm not sure though. I'm also not
sure of the remedy. It seems like we ought to have some upper bound on
the number of SETCLIENTID attempts?
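
One way an upper bound could look, again as a toy userspace model rather
than a patch. TOY_MAX_ATTEMPTS and the toy_* names are hypothetical. The
model also assumes that dropping the last reference on a clone releases
its parent recursively, which is a guess based on the alternating
rpc_free_client/rpc_release_client frames in the trace:

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_MAX_ATTEMPTS 8   /* hypothetical cap, not a kernel constant */

/* Toy stand-in for struct rpc_clnt (hypothetical, not the kernel type). */
struct toy_clnt {
    int refcount;
    struct toy_clnt *parent;
};

/* Create a client; a clone pins its parent with an extra reference. */
static struct toy_clnt *toy_clone(struct toy_clnt *parent)
{
    struct toy_clnt *c = calloc(1, sizeof(*c));

    assert(c != NULL);
    c->refcount = 1;
    c->parent = parent;
    if (parent)
        parent->refcount++;
    return c;
}

/*
 * Drop one reference. On the last one, free the client and release its
 * parent recursively. Returns the nesting depth that was reached.
 */
static int toy_release(struct toy_clnt *c)
{
    struct toy_clnt *parent;

    if (--c->refcount > 0)
        return 1;
    parent = c->parent;
    free(c);
    return parent ? 1 + toy_release(parent) : 1;
}

/*
 * The server rejects the call `failures` times in a row. Retry with a
 * fresh clone each time, but give up once TOY_MAX_ATTEMPTS is reached.
 * Returns how deep the final teardown recursed.
 */
static int toy_mount_teardown_depth(int failures)
{
    struct toy_clnt *cur = toy_clone(NULL);

    for (int attempt = 1; attempt <= failures; attempt++) {
        struct toy_clnt *clone;

        if (attempt >= TOY_MAX_ATTEMPTS)
            break;               /* give up instead of cloning again */
        clone = toy_clone(cur);
        toy_release(cur);        /* old client survives, clone pins it */
        cur = clone;
    }
    return toy_release(cur);     /* tear down whatever chain was built */
}
```

With the cap in place the teardown depth in this model never exceeds
TOY_MAX_ATTEMPTS, no matter how many times the server keeps returning the
same error.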

-- 
Jeff Layton <jlayton@redhat.com>


* Re: Thread overran stack, or stack corrupted BUG on mount
  2013-11-12 16:20   ` Jeff Layton
@ 2013-11-12 16:23     ` Chuck Lever
  2013-11-12 17:30       ` Myklebust, Trond
  2013-11-12 16:57     ` J. Bruce Fields
  1 sibling, 1 reply; 10+ messages in thread
From: Chuck Lever @ 2013-11-12 16:23 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Weston Andros Adamson, linux-nfs list


On Nov 12, 2013, at 11:20 AM, Jeff Layton <jlayton@redhat.com> wrote:

> On Tue, 12 Nov 2013 10:55:39 -0500
> Jeff Layton <jlayton@redhat.com> wrote:
> 
>> On Tue, 12 Nov 2013 15:31:34 +0000
>> Weston Andros Adamson <dros@netapp.com> wrote:
>> 
>>> I got this oops yesterday running the “test_sec_options.sh” script I recently posted as a patch to Anna’s nfs-ordeal repo (tons of mount/umount).
>>> 
>>> At this point GSSD had died (I was tracking down a fd leak).  I haven’t been able to reproduce this yet.
>>> 
>>> Any idea if I should trust the stack trace? Could this be related to the issue Jeff just posted?
>>> 
>>> -dros
>>> 
>>> BUG: unable to handle kernel paging request at ffff88017a604030
>>> IP: [<ffffffff81063089>] __wake_up+0x22/0x4d
>>> PGD 2651067 PUD 0 
>>> Thread overran stack, or stack corrupted
>>> Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
>>> Modules linked in: nfsv4 cts rpcsec_gss_krb5 nfsv3 nfs fscache crc32c_intel aesni_intel aes_x86_64 glue_helper lrw gf128mul ppdev ablk_helper cryptd serio_raw i2c_piix4 i2c_core e1000 nfsd parport_pc parport shpchp auth_rpcgss oid_registry exportfs nfs_acl lockd floppy freq_table sunrpc autofs4 mptspi scsi_transport_spi mptscsih mptbase ata_generic
>>> CPU: 0 PID: 10547 Comm: mount.nfs Not tainted 3.12.0-rc3-branch-dros_testing+ #1
>>> Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013
>>> task: ffff8800798f2100 ti: ffff88007a604000 task.ti: ffff88007a604000
>>> RIP: 0010:[<ffffffff81063089>]  [<ffffffff81063089>] __wake_up+0x22/0x4d
>>> RSP: 0018:ffff88007a604028  EFLAGS: 00010092
>>> RAX: 0000000000000296 RBX: ffffffffa006a980 RCX: 000000009a519a50
>>> RDX: 000000009a509a50 RSI: 000000000000038a RDI: ffffffffa006a980
>>> RBP: ffff88017a604058 R08: 0000000000000003 R09: 0000000000000001
>>> R10: ffff88006d41d7c0 R11: ffff88007f20b000 R12: ffff88007a6058e0
>>> R13: ffff8800645d8018 R14: ffff88007a6058f8 R15: ffff8800798f2100
>>> FS:  00007fb2765b3880(0000) GS:ffff88007f200000(0000) knlGS:0000000000000000
>>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> CR2: ffff88017a604030 CR3: 000000007a6ba000 CR4: 00000000001407f0
>>> Stack:
>>> ffff88007a604038 0000000000000000 ffff880000000001 0000000000000003
>>> ffff88006453a0d0 ffff88007a6058e0 ffff88007a604078 ffffffffa0045de2
>>> ffff88006453dbe0 ffff88006453dbe0 ffff88007a604098 ffffffffa0045d27
>>> Call Trace:
>>> [<ffffffffa0045de2>] ? rpc_release_client+0x4a/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
>>> [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
>>> [<ffffffffa0045f39>] ? rpc_shutdown_client+0x107/0x116 [sunrpc]
>>> [<ffffffffa02a6456>] ? __fscache_cookie_put+0x43/0x4f [fscache]
>>> [<ffffffffa02a65ca>] ? __fscache_relinquish_cookie+0x168/0x16d [fscache]
>>> [<ffffffffa02bdc2b>] ? nfs_free_client+0x4c/0xaf [nfs]
>>> [<ffffffffa0340e4a>] ? nfs4_free_client+0x97/0x9b [nfsv4]
>>> [<ffffffffa02bcfd9>] ? nfs_put_client+0xe8/0xed [nfs]
>>> [<ffffffffa0341126>] ? nfs4_init_client+0x22e/0x29d [nfsv4]
>>> [<ffffffffa02bc95f>] ? nfs_probe_fsinfo+0x2c7/0x2c7 [nfs]
>>> [<ffffffffa02bd1af>] ? nfs_get_client+0x8a/0x2bf [nfs]
>>> [<ffffffffa02bd37f>] ? nfs_get_client+0x25a/0x2bf [nfs]
>>> [<ffffffffa034059c>] ? nfs4_set_client+0x9f/0xf1 [nfsv4]
>>> [<ffffffffa004e917>] ? __rpc_init_priority_wait_queue+0x98/0xcf [sunrpc]
>>> [<ffffffffa0341999>] ? nfs4_create_server+0xfe/0x264 [nfsv4]
>>> [<ffffffffa033ac59>] ? nfs4_remote_mount+0x2f/0x57 [nfsv4]
>>> [<ffffffff8112a846>] ? mount_fs+0x69/0x157
>>> [<ffffffff810fb79b>] ? __alloc_percpu+0x10/0x12
>>> [<ffffffff8113fcbd>] ? vfs_kern_mount+0x62/0xd9
>>> [<ffffffffa033ac02>] ? nfs_do_root_mount+0x8c/0xb4 [nfsv4]
>>> [<ffffffffa033aea9>] ? nfs4_try_mount+0x60/0xbb [nfsv4]
>>> [<ffffffffa02c80eb>] ? nfs_fs_mount+0x88f/0x97a [nfs]
>>> [<ffffffffa02c8620>] ? nfs_clone_super+0x6b/0x6b [nfs]
>>> [<ffffffffa02c59ce>] ? nfs_set_super+0x53/0x53 [nfs]
>>> [<ffffffff8112a846>] ? mount_fs+0x69/0x157
>>> [<ffffffff810fb79b>] ? __alloc_percpu+0x10/0x12
>>> [<ffffffff8113fcbd>] ? vfs_kern_mount+0x62/0xd9
>>> [<ffffffff81141fa6>] ? do_mount+0x6ce/0x871
>>> [<ffffffff81141833>] ? copy_mount_options+0xc2/0x12f
>>> [<ffffffff811421ce>] ? SyS_mount+0x85/0xbe
>>> [<ffffffff814a4292>] ? system_call_fastpath+0x16/0x1b
>>> Code: 89 e5 e8 98 ff ff ff 5d c3 0f 1f 44 00 00 55 48 89 e5 41 54 53 48 89 fb 48 83 ec 20 89 55 e0 89 75 e8 48 89 4d d8 e8 0c 99 43 00 <4c> 8b 45 d8 48 89 df 8b 55 e0 49 89 c4 31 c9 8b 75 e8 e8 b4 d0 
>>> RIP  [<ffffffff81063089>] __wake_up+0x22/0x4d
>>> RSP <ffff88007a604028>
>>> CR2: ffff88017a604030
>>> ---[ end trace 1122f3f8cf98e4c2 ]---
>>> 
>> 
>> Yep, I think this is the same problem I reported earlier. I ran the
>> reproducer with rpc_debug turned up and ended up seeing a very similar
>> stack trace. Basically the server is returning NFS4ERR_CLID_INUSE but
>> the client keeps retrying the call over and over.
>> 
>> I suspect that leads to some sort of recursion, but I haven't quite
>> spotted it yet.
>> 
> 
> (cc'ing Chuck since I think the problem is in the new detect_trunking code)
> 
> Ok, I think I see the problem. The looping comes from this block in
> nfs4_discover_server_trunking:
> 
> -----------------[snip]-----------------
>        case -NFS4ERR_CLID_INUSE:
>        case -NFS4ERR_WRONGSEC:
>                clnt = rpc_clone_client_set_auth(clnt, RPC_AUTH_UNIX);
>                if (IS_ERR(clnt)) {
>                        status = PTR_ERR(clnt);
>                        break;
>                }
>                /* Note: this is safe because we haven't yet marked the
>                 * client as ready, so we are the only user of
>                 * clp->cl_rpcclient
>                 */
>                clnt = xchg(&clp->cl_rpcclient, clnt);
>                rpc_shutdown_client(clnt);
>                clnt = clp->cl_rpcclient;
>                goto again;
> -----------------[snip]-----------------
> 
> ...so in the case of the reproducer, we get back -NFS4ERR_CLID_INUSE,
> at that point we call rpc_clone_client_set_auth(), which creates a new
> rpc_clnt, but it's created as a child of the original.
> 
> When rpc_shutdown_client is called, the original clnt is not destroyed
> because the child still holds a reference to it. So, we go and try the
> call again and it fails with the same error over and over again, and we
> end up with a long chain of rpc_clnt's.
> 
> How that ends up smashing the stack, I'm not sure though. I'm also not
> sure of the remedy. It seems like we might ought to have some upper
> bound on the number of SETCLIENTID attempts?

CLID_INUSE is supposed to be a permanent error now.  I think one retry, if any, is all that is appropriate.
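
For reference, the chain growth Jeff describes above can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: struct model_clnt, model_clone() and chain_after_retries() are hypothetical stand-ins, not the sunrpc structures or functions. The only behaviour taken from the thread is that a clone takes a reference on its parent, so shutting down the old client does not free it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct rpc_clnt: a refcount and a parent link. */
struct model_clnt {
	int refs;
	struct model_clnt *parent;	/* a root client points at itself */
};

static struct model_clnt *model_clone(struct model_clnt *parent)
{
	struct model_clnt *c = malloc(sizeof(*c));

	assert(c != NULL);
	c->refs = 1;			/* the caller's reference */
	c->parent = parent ? parent : c;
	if (parent)
		parent->refs++;		/* the clone pins its parent */
	return c;
}

/* Number of clients reachable from c through parent links, c included. */
static int chain_length(const struct model_clnt *c)
{
	int n = 1;

	while (c->parent != c) {
		c = c->parent;
		n++;
	}
	return n;
}

/* Run the clone/swap/shutdown loop `retries` times; return the chain length. */
static int chain_after_retries(int retries)
{
	struct model_clnt *cur = model_clone(NULL);
	int len;

	for (int i = 0; i < retries; i++) {
		struct model_clnt *old = cur;

		cur = model_clone(old);	/* models rpc_clone_client_set_auth() */
		old->refs--;		/* models rpc_shutdown_client(old) */
		assert(old->refs == 1);	/* still pinned by the clone, so not freed */
	}
	len = chain_length(cur);

	/* Tear the model down iteratively so the demo itself cannot recurse. */
	while (cur) {
		struct model_clnt *next = (cur->parent != cur) ? cur->parent : NULL;

		free(cur);
		cur = next;
	}
	return len;
}
```

In this model every failed attempt adds exactly one client to the chain, so an unbounded "goto again" gives an unbounded chain.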

-- 
Chuck Lever
chuck[dot]lever[at]oracle[dot]com

* Re: Thread overran stack, or stack corrupted BUG on mount
  2013-11-12 16:20   ` Jeff Layton
  2013-11-12 16:23     ` Chuck Lever
@ 2013-11-12 16:57     ` J. Bruce Fields
  2013-11-12 17:50       ` Myklebust, Trond
  1 sibling, 1 reply; 10+ messages in thread
From: J. Bruce Fields @ 2013-11-12 16:57 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Weston Andros Adamson, linux-nfs list, chuck.lever

On Tue, Nov 12, 2013 at 11:20:21AM -0500, Jeff Layton wrote:
> On Tue, 12 Nov 2013 10:55:39 -0500
> Jeff Layton <jlayton@redhat.com> wrote:
> 
> > On Tue, 12 Nov 2013 15:31:34 +0000
> > Weston Andros Adamson <dros@netapp.com> wrote:
> > 
> > > I got this oops yesterday running the “test_sec_options.sh” script I recently posted as a patch to Anna’s nfs-ordeal repo (tons of mount/umount).
> > > 
> > > At this point GSSD had died (I was tracking down a fd leak).  I haven’t been able to reproduce this yet.
> > > 
> > > Any idea if I should trust the stack trace? Could this be related to the issue Jeff just posted?
> > > 
> > > -dros
> > > 
> > > BUG: unable to handle kernel paging request at ffff88017a604030
> > > IP: [<ffffffff81063089>] __wake_up+0x22/0x4d
> > > PGD 2651067 PUD 0 
> > > Thread overran stack, or stack corrupted
> > > Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
> > > Modules linked in: nfsv4 cts rpcsec_gss_krb5 nfsv3 nfs fscache crc32c_intel aesni_intel aes_x86_64 glue_helper lrw gf128mul ppdev ablk_helper cryptd serio_raw i2c_piix4 i2c_core e1000 nfsd parport_pc parport shpchp auth_rpcgss oid_registry exportfs nfs_acl lockd floppy freq_table sunrpc autofs4 mptspi scsi_transport_spi mptscsih mptbase ata_generic
> > > CPU: 0 PID: 10547 Comm: mount.nfs Not tainted 3.12.0-rc3-branch-dros_testing+ #1
> > > Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013
> > > task: ffff8800798f2100 ti: ffff88007a604000 task.ti: ffff88007a604000
> > > RIP: 0010:[<ffffffff81063089>]  [<ffffffff81063089>] __wake_up+0x22/0x4d
> > > RSP: 0018:ffff88007a604028  EFLAGS: 00010092
> > > RAX: 0000000000000296 RBX: ffffffffa006a980 RCX: 000000009a519a50
> > > RDX: 000000009a509a50 RSI: 000000000000038a RDI: ffffffffa006a980
> > > RBP: ffff88017a604058 R08: 0000000000000003 R09: 0000000000000001
> > > R10: ffff88006d41d7c0 R11: ffff88007f20b000 R12: ffff88007a6058e0
> > > R13: ffff8800645d8018 R14: ffff88007a6058f8 R15: ffff8800798f2100
> > > FS:  00007fb2765b3880(0000) GS:ffff88007f200000(0000) knlGS:0000000000000000
> > > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > CR2: ffff88017a604030 CR3: 000000007a6ba000 CR4: 00000000001407f0
> > > Stack:
> > >  ffff88007a604038 0000000000000000 ffff880000000001 0000000000000003
> > >  ffff88006453a0d0 ffff88007a6058e0 ffff88007a604078 ffffffffa0045de2
> > >  ffff88006453dbe0 ffff88006453dbe0 ffff88007a604098 ffffffffa0045d27
> > > Call Trace:
> > >  [<ffffffffa0045de2>] ? rpc_release_client+0x4a/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045d27>] ? rpc_free_client+0x56/0xc7 [sunrpc]
> > >  [<ffffffffa0045e00>] ? rpc_release_client+0x68/0x9a [sunrpc]
> > >  [<ffffffffa0045f39>] ? rpc_shutdown_client+0x107/0x116 [sunrpc]
> > >  [<ffffffffa02a6456>] ? __fscache_cookie_put+0x43/0x4f [fscache]
> > >  [<ffffffffa02a65ca>] ? __fscache_relinquish_cookie+0x168/0x16d [fscache]
> > >  [<ffffffffa02bdc2b>] ? nfs_free_client+0x4c/0xaf [nfs]
> > >  [<ffffffffa0340e4a>] ? nfs4_free_client+0x97/0x9b [nfsv4]
> > >  [<ffffffffa02bcfd9>] ? nfs_put_client+0xe8/0xed [nfs]
> > >  [<ffffffffa0341126>] ? nfs4_init_client+0x22e/0x29d [nfsv4]
> > >  [<ffffffffa02bc95f>] ? nfs_probe_fsinfo+0x2c7/0x2c7 [nfs]
> > >  [<ffffffffa02bd1af>] ? nfs_get_client+0x8a/0x2bf [nfs]
> > >  [<ffffffffa02bd37f>] ? nfs_get_client+0x25a/0x2bf [nfs]
> > >  [<ffffffffa034059c>] ? nfs4_set_client+0x9f/0xf1 [nfsv4]
> > >  [<ffffffffa004e917>] ? __rpc_init_priority_wait_queue+0x98/0xcf [sunrpc]
> > >  [<ffffffffa0341999>] ? nfs4_create_server+0xfe/0x264 [nfsv4]
> > >  [<ffffffffa033ac59>] ? nfs4_remote_mount+0x2f/0x57 [nfsv4]
> > >  [<ffffffff8112a846>] ? mount_fs+0x69/0x157
> > >  [<ffffffff810fb79b>] ? __alloc_percpu+0x10/0x12
> > >  [<ffffffff8113fcbd>] ? vfs_kern_mount+0x62/0xd9
> > >  [<ffffffffa033ac02>] ? nfs_do_root_mount+0x8c/0xb4 [nfsv4]
> > >  [<ffffffffa033aea9>] ? nfs4_try_mount+0x60/0xbb [nfsv4]
> > >  [<ffffffffa02c80eb>] ? nfs_fs_mount+0x88f/0x97a [nfs]
> > >  [<ffffffffa02c8620>] ? nfs_clone_super+0x6b/0x6b [nfs]
> > >  [<ffffffffa02c59ce>] ? nfs_set_super+0x53/0x53 [nfs]
> > >  [<ffffffff8112a846>] ? mount_fs+0x69/0x157
> > >  [<ffffffff810fb79b>] ? __alloc_percpu+0x10/0x12
> > >  [<ffffffff8113fcbd>] ? vfs_kern_mount+0x62/0xd9
> > >  [<ffffffff81141fa6>] ? do_mount+0x6ce/0x871
> > >  [<ffffffff81141833>] ? copy_mount_options+0xc2/0x12f
> > >  [<ffffffff811421ce>] ? SyS_mount+0x85/0xbe
> > >  [<ffffffff814a4292>] ? system_call_fastpath+0x16/0x1b
> > > Code: 89 e5 e8 98 ff ff ff 5d c3 0f 1f 44 00 00 55 48 89 e5 41 54 53 48 89 fb 48 83 ec 20 89 55 e0 89 75 e8 48 89 4d d8 e8 0c 99 43 00 <4c> 8b 45 d8 48 89 df 8b 55 e0 49 89 c4 31 c9 8b 75 e8 e8 b4 d0 
> > > RIP  [<ffffffff81063089>] __wake_up+0x22/0x4d
> > >  RSP <ffff88007a604028>
> > > CR2: ffff88017a604030
> > > ---[ end trace 1122f3f8cf98e4c2 ]---
> > > 
> > 
> > Yep, I think this is the same problem I reported earlier. I ran the
> > reproducer with rpc_debug turned up and ended up seeing a very similar
> > stack trace. Basically the server is returning NFS4ERR_CLID_INUSE but
> > the client keeps retrying the call over and over.
> > 
> > I suspect that leads to some sort of recursion, but I haven't quite
> > spotted it yet.
> > 
> 
> (cc'ing Chuck since I think the problem is in the new detect_trunking code)
> 
> Ok, I think I see the problem. The looping comes from this block in
> nfs4_discover_server_trunking:
> 
> -----------------[snip]-----------------
>         case -NFS4ERR_CLID_INUSE:
>         case -NFS4ERR_WRONGSEC:
>                 clnt = rpc_clone_client_set_auth(clnt, RPC_AUTH_UNIX);
>                 if (IS_ERR(clnt)) {
>                         status = PTR_ERR(clnt);
>                         break;
>                 }
>                 /* Note: this is safe because we haven't yet marked the
>                  * client as ready, so we are the only user of
>                  * clp->cl_rpcclient
>                  */
>                 clnt = xchg(&clp->cl_rpcclient, clnt);
>                 rpc_shutdown_client(clnt);
>                 clnt = clp->cl_rpcclient;
>                 goto again;
> -----------------[snip]-----------------
> 
> ...so in the case of the reproducer, we get back -NFS4ERR_CLID_INUSE,
> at that point we call rpc_clone_client_set_auth(), which creates a new
> rpc_clnt, but it's created as a child of the original.
> 
> When rpc_shutdown_client is called, the original clnt is not destroyed
> because the child still holds a reference to it. So, we go and try the
> call again and it fails with the same error over and over again, and we
> end up with a long chain of rpc_clnt's.
> 
> How that ends up smashing the stack, I'm not sure though.

rpc_free_client(clnt)
	rpc_release_client(clnt->cl_parent)
		rpc_free_auth(clnt)
			rpc_free_client(clnt)

So freeing a client with N ancestors can take N times the stack as
freeing a single client.

(Are there any other cases that can create arbitrarily long cl_parent
chains?)
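
A minimal userspace sketch of that recursion, for illustration only. It assumes nothing about the real sunrpc code beyond the call shape listed above: struct model_clnt, model_free(), model_release() and the depth counters are hypothetical names. release_iterative() is one possible way to bound stack use, not a proposed patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct rpc_clnt. */
struct model_clnt {
	int refs;
	struct model_clnt *parent;	/* a root client points at itself */
};

static int depth, max_depth;	/* nesting of model_free() calls */

static void model_release(struct model_clnt *c);

/* Shaped like rpc_free_client(): drop the parent reference, then free. */
static void model_free(struct model_clnt *c)
{
	struct model_clnt *parent = c->parent;

	if (++depth > max_depth)
		max_depth = depth;
	if (parent != c)
		model_release(parent);	/* one more set of frames per ancestor */
	free(c);
	depth--;
}

/* Shaped like rpc_release_client(): the last reference triggers the free. */
static void model_release(struct model_clnt *c)
{
	if (--c->refs == 0)
		model_free(c);
}

/* Build a chain of n clients; each is held only by its child (or the caller). */
static struct model_clnt *build_chain(int n)
{
	struct model_clnt *c = NULL;

	for (int i = 0; i < n; i++) {
		struct model_clnt *child = malloc(sizeof(*child));

		assert(child != NULL);
		child->refs = 1;
		child->parent = c ? c : child;
		c = child;
	}
	return c;
}

/* Peak model_free() nesting when the leaf of an n-client chain is released. */
static int recursive_free_depth(int n)
{
	depth = max_depth = 0;
	model_release(build_chain(n));
	return max_depth;
}

/* Alternative: walk up the chain in a loop, using constant stack. */
static int release_iterative(struct model_clnt *c)
{
	int freed = 0;

	while (c && --c->refs == 0) {
		struct model_clnt *parent = (c->parent != c) ? c->parent : NULL;

		free(c);
		freed++;
		c = parent;
	}
	return freed;
}
```

In the model, recursive_free_depth(n) grows linearly with n, while release_iterative() frees the same chain from a single frame.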

--b.

> I'm also not
> sure of the remedy. It seems like we might ought to have some upper
> bound on the number of SETCLIENTID attempts?
> 
> -- 
> Jeff Layton <jlayton@redhat.com>

* Re: Thread overran stack, or stack corrupted BUG on mount
  2013-11-12 16:23     ` Chuck Lever
@ 2013-11-12 17:30       ` Myklebust, Trond
  2013-11-12 17:33         ` Chuck Lever
  2013-11-12 17:52         ` Weston Andros Adamson
  0 siblings, 2 replies; 10+ messages in thread
From: Myklebust, Trond @ 2013-11-12 17:30 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Jeff Layton, Weston Andros Adamson, linux-nfs list

On Tue, 2013-11-12 at 11:23 -0500, Chuck Lever wrote:
> On Nov 12, 2013, at 11:20 AM, Jeff Layton <jlayton@redhat.com> wrote:
> > Ok, I think I see the problem. The looping comes from this block in
> > nfs4_discover_server_trunking:
> > 
> > -----------------[snip]-----------------
> >        case -NFS4ERR_CLID_INUSE:
> >        case -NFS4ERR_WRONGSEC:
> >                clnt = rpc_clone_client_set_auth(clnt, RPC_AUTH_UNIX);
> >                if (IS_ERR(clnt)) {
> >                        status = PTR_ERR(clnt);
> >                        break;
> >                }
> >                /* Note: this is safe because we haven't yet marked the
> >                 * client as ready, so we are the only user of
> >                 * clp->cl_rpcclient
> >                 */
> >                clnt = xchg(&clp->cl_rpcclient, clnt);
> >                rpc_shutdown_client(clnt);
> >                clnt = clp->cl_rpcclient;
> >                goto again;
> > -----------------[snip]-----------------
> > 
> > ...so in the case of the reproducer, we get back -NFS4ERR_CLID_INUSE,
> > at that point we call rpc_clone_client_set_auth(), which creates a new
> > rpc_clnt, but it's created as a child of the original.
> > 
> > When rpc_shutdown_client is called, the original clnt is not destroyed
> > because the child still holds a reference to it. So, we go and try the
> > call again and it fails with the same error over and over again, and we
> > end up with a long chain of rpc_clnt's.
> > 
> > How that ends up smashing the stack, I'm not sure though. I'm also not
> > sure of the remedy. It seems like we might ought to have some upper
> > bound on the number of SETCLIENTID attempts?
> 
> CLID_INUSE is supposed to be a permanent error now.  I think one retry, if any, is all that is appropriate.

Right. If we hit CLID_INUSE in nfs4_discover_server_trunking then

a) we know this is a server that we've already mounted
b) we know that either nfs4_init_client set us up with RPC_AUTH_UNIX to
begin with, or that rpc.gssd was started only after we'd already sent a
SETCLIENTID/EXCHANGE_ID using RPC_AUTH_UNIX to this server

so the correct thing to do is to retry once if we know that we're not
already using AUTH_SYS, and then to return EPERM.
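
As a rough userspace sketch of that policy (not the actual nfs4_discover_server_trunking code): enum model_flavor, MODEL_CLID_INUSE, the probe callbacks and discover() are all hypothetical names invented for the example. The only parts taken from the discussion are the single fallback to AUTH_SYS and the final EPERM.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the RPC auth flavours involved. */
enum model_flavor { MODEL_AUTH_GSS, MODEL_AUTH_SYS };

/* Hypothetical error code standing in for NFS4ERR_CLID_INUSE. */
#define MODEL_CLID_INUSE 1

typedef int (*probe_fn)(enum model_flavor flavor);

/*
 * Try the probe with the starting flavour. On CLID_INUSE, fall back to
 * AUTH_SYS exactly once; if we were already on AUTH_SYS, give up with
 * -EPERM. The probe therefore runs at most twice.
 */
static int discover(enum model_flavor start, probe_fn probe, int *attempts)
{
	enum model_flavor flavor = start;
	int status;

	if (attempts)
		*attempts = 0;
	for (;;) {
		status = probe(flavor);
		if (attempts)
			(*attempts)++;
		if (status != -MODEL_CLID_INUSE)
			return status;
		if (flavor == MODEL_AUTH_SYS)
			return -EPERM;		/* nothing left to try */
		flavor = MODEL_AUTH_SYS;	/* the single permitted retry */
	}
}

/* Example probes: a server that always refuses, and one that accepts AUTH_SYS. */
static int probe_always_inuse(enum model_flavor flavor)
{
	(void)flavor;
	return -MODEL_CLID_INUSE;
}

static int probe_sys_ok(enum model_flavor flavor)
{
	return flavor == MODEL_AUTH_SYS ? 0 : -MODEL_CLID_INUSE;
}

/* Convenience wrapper: how many times did discover() call the probe? */
static int count_attempts(enum model_flavor start, probe_fn probe)
{
	int attempts;

	discover(start, probe, &attempts);
	return attempts;
}
```

With a server that always answers CLID_INUSE, the loop stops after two probes when it starts from a GSS flavour and after one when it starts from AUTH_SYS, so no chain of cloned clients can build up.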


Now that said, I agree that this should not be able to trigger a stack
overflow. Is this NFSv4 or NFSv4.1/NFSv4.2? Have either of you (Jeff and
Dros) tried enabling DEBUG_STACKOVERFLOW?

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

* Re: Thread overran stack, or stack corrupted BUG on mount
  2013-11-12 17:30       ` Myklebust, Trond
@ 2013-11-12 17:33         ` Chuck Lever
  2013-11-12 17:41           ` Jeff Layton
  2013-11-12 17:52         ` Weston Andros Adamson
  1 sibling, 1 reply; 10+ messages in thread
From: Chuck Lever @ 2013-11-12 17:33 UTC (permalink / raw)
  To: Myklebust, Trond; +Cc: Jeff Layton, Weston Andros Adamson, linux-nfs list


On Nov 12, 2013, at 12:30 PM, "Myklebust, Trond" <Trond.Myklebust@netapp.com> wrote:

> On Tue, 2013-11-12 at 11:23 -0500, Chuck Lever wrote:
>> On Nov 12, 2013, at 11:20 AM, Jeff Layton <jlayton@redhat.com> wrote:
>>> Ok, I think I see the problem. The looping comes from this block in
>>> nfs4_discover_server_trunking:
>>> 
>>> -----------------[snip]-----------------
>>>       case -NFS4ERR_CLID_INUSE:
>>>       case -NFS4ERR_WRONGSEC:
>>>               clnt = rpc_clone_client_set_auth(clnt, RPC_AUTH_UNIX);
>>>               if (IS_ERR(clnt)) {
>>>                       status = PTR_ERR(clnt);
>>>                       break;
>>>               }
>>>               /* Note: this is safe because we haven't yet marked the
>>>                * client as ready, so we are the only user of
>>>                * clp->cl_rpcclient
>>>                */
>>>               clnt = xchg(&clp->cl_rpcclient, clnt);
>>>               rpc_shutdown_client(clnt);
>>>               clnt = clp->cl_rpcclient;
>>>               goto again;
>>> -----------------[snip]-----------------
>>> 
>>> ...so in the case of the reproducer, we get back -NFS4ERR_CLID_INUSE,
>>> at that point we call rpc_clone_client_set_auth(), which creates a new
>>> rpc_clnt, but it's created as a child of the original.
>>> 
>>> When rpc_shutdown_client is called, the original clnt is not destroyed
>>> because the child still holds a reference to it. So, we go and try the
>>> call again and it fails with the same error over and over again, and we
>>> end up with a long chain of rpc_clnt's.
>>> 
>>> How that ends up smashing the stack, I'm not sure though. I'm also not
>>> sure of the remedy. It seems like we might ought to have some upper
>>> bound on the number of SETCLIENTID attempts?
>> 
>> CLID_INUSE is supposed to be a permanent error now.  I think one retry, if any, is all that is appropriate.
> 
> Right. If we hit CLID_INUSE in nfs4_discover_server_trunking then
> 
> a) we know this is a server that we've already mounted
> b) we know that either nfs4_init_client set us up with RPC_AUTH_UNIX to
> begin with, or that rpc.gssd was started only after we'd already sent a
> SETCLIENTID/EXCHANGE_ID using RPC_AUTH_UNIX to this server
> 
> so the correct thing to do is to retry once if we know that we're not
> already using AUTH_SYS, and then to EPERM.

Agree.  Sorry I didn't spell that out.


> Now that said, I agree that this should not be able to trigger a stack
> overflow. Is this NFSv4 or NFSv4.1/NFSv4.2? Have either of you (Jeff and
> Dros) tried enabling DEBUG_STACKOVERFLOW?
> 
> -- 
> Trond Myklebust
> Linux NFS client maintainer
> 
> NetApp
> Trond.Myklebust@netapp.com
> www.netapp.com

-- 
Chuck Lever
chuck[dot]lever[at]oracle[dot]com






* Re: Thread overran stack, or stack corrupted BUG on mount
  2013-11-12 17:33         ` Chuck Lever
@ 2013-11-12 17:41           ` Jeff Layton
  0 siblings, 0 replies; 10+ messages in thread
From: Jeff Layton @ 2013-11-12 17:41 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Myklebust, Trond, Weston Andros Adamson, linux-nfs list, bfields

On Tue, 12 Nov 2013 12:33:28 -0500
Chuck Lever <chuck.lever@oracle.com> wrote:

> 
> On Nov 12, 2013, at 12:30 PM, "Myklebust, Trond" <Trond.Myklebust@netapp.com> wrote:
> 
> > On Tue, 2013-11-12 at 11:23 -0500, Chuck Lever wrote:
> >> On Nov 12, 2013, at 11:20 AM, Jeff Layton <jlayton@redhat.com> wrote:
> >>> Ok, I think I see the problem. The looping comes from this block in
> >>> nfs4_discover_server_trunking:
> >>> 
> >>> -----------------[snip]-----------------
> >>>       case -NFS4ERR_CLID_INUSE:
> >>>       case -NFS4ERR_WRONGSEC:
> >>>               clnt = rpc_clone_client_set_auth(clnt, RPC_AUTH_UNIX);
> >>>               if (IS_ERR(clnt)) {
> >>>                       status = PTR_ERR(clnt);
> >>>                       break;
> >>>               }
> >>>               /* Note: this is safe because we haven't yet marked the
> >>>                * client as ready, so we are the only user of
> >>>                * clp->cl_rpcclient
> >>>                */
> >>>               clnt = xchg(&clp->cl_rpcclient, clnt);
> >>>               rpc_shutdown_client(clnt);
> >>>               clnt = clp->cl_rpcclient;
> >>>               goto again;
> >>> -----------------[snip]-----------------
> >>> 
> >>> ...so in the case of the reproducer, we get back -NFS4ERR_CLID_INUSE,
> >>> at that point we call rpc_clone_client_set_auth(), which creates a new
> >>> rpc_clnt, but it's created as a child of the original.
> >>> 
> >>> When rpc_shutdown_client is called, the original clnt is not destroyed
> >>> because the child still holds a reference to it. So, we go and try the
> >>> call again and it fails with the same error over and over again, and we
> >>> end up with a long chain of rpc_clnt's.
> >>> 
> >>> How that ends up smashing the stack, I'm not sure though. I'm also not
> >>> sure of the remedy. It seems like we might ought to have some upper
> >>> bound on the number of SETCLIENTID attempts?
> >> 
> >> CLID_INUSE is supposed to be a permanent error now.  I think one retry, if any, is all that is appropriate.
> > 
> > Right. If we hit CLID_INUSE in nfs4_discover_server_trunking then
> > 
> > a) we know this is a server that we've already mounted
> > b) we know that either nfs4_init_client set us up with RPC_AUTH_UNIX to
> > begin with, or that rpc.gssd was started only after we'd already sent a
> > SETCLIENTID/EXCHANGE_ID using RPC_AUTH_UNIX to this server
> > 
> > so the correct thing to do is to retry once if we know that we're not
> > already using AUTH_SYS, and then to EPERM.
> 
> Agree.  Sorry I didn't spell that out.
> 
> 
> > Now that said, I agree that this should not be able to trigger a stack
> > overflow. Is this NFSv4 or NFSv4.1/NFSv4.2? Have either of you (Jeff and
> > Dros) tried enabling DEBUG_STACKOVERFLOW?
> > 

My kernel says it's on -- but the comments on stack_overflow_check
aren't encouraging for finding this sort of thing:

/*
 * Probabilistic stack overflow check:
 *
 * Only check the stack in process context, because everything else
 * runs on the big interrupt stacks. Checking reliably is too expensive,
 * so we just check from interrupts.
 */


...as to Bruce's earlier question, the recursion in how this stuff is
freed does seem a bit spooky...

Perhaps we could try freeing the chain iteratively somehow, so that it
doesn't recurse.

...and/or maybe we should BUG() or WARN() if you create a chain of
clients more than 10-20 deep?
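
A rough user-space sketch of both ideas (iterative teardown plus a cap
on chain depth), under stated assumptions: struct model_clnt,
model_clone(), model_release(), model_chain_depth() and
MODEL_MAX_CHAIN are all hypothetical names, the refcount is a plain
int rather than an atomic, and the cap of 16 is picked arbitrarily from
the "10-20 deep" range. It is not the sunrpc code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of a refcounted client that pins its parent. */
struct model_clnt {
	int refcount;
	struct model_clnt *parent;
};

#define MODEL_MAX_CHAIN 16	/* assumed cap on the number of ancestors */

/* Number of ancestors above c (0 for a root client). */
static int model_chain_depth(const struct model_clnt *c)
{
	int depth = 0;

	for (; c && c->parent; c = c->parent)
		depth++;
	return depth;
}

/*
 * Create a client holding one reference on its parent. Returns NULL if
 * the new client would have more than MODEL_MAX_CHAIN ancestors (where
 * kernel code might WARN instead) or if allocation fails.
 */
static struct model_clnt *model_clone(struct model_clnt *parent)
{
	struct model_clnt *c;

	if (parent && model_chain_depth(parent) + 1 > MODEL_MAX_CHAIN)
		return NULL;
	c = malloc(sizeof(*c));
	if (!c)
		return NULL;
	c->refcount = 1;
	c->parent = parent;
	if (parent)
		parent->refcount++;
	return c;
}

/*
 * Drop one reference. If that frees the client, drop its reference on
 * the parent as well, walking up in a loop instead of recursing, so the
 * stack use stays constant however long the chain is. Returns the
 * number of clients freed.
 */
static int model_release(struct model_clnt *c)
{
	int freed = 0;

	while (c) {
		struct model_clnt *parent;

		assert(c->refcount > 0);
		if (--c->refcount > 0)
			break;
		parent = c->parent;
		free(c);
		freed++;
		c = parent;
	}
	return freed;
}
```

Releasing the newest client of a chain then frees every ancestor in one
loop, and model_clone() refuses to extend a chain past the cap.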

-- 
Jeff Layton <jlayton@redhat.com>


* Re: Thread overran stack, or stack corrupted BUG on mount
  2013-11-12 16:57     ` J. Bruce Fields
@ 2013-11-12 17:50       ` Myklebust, Trond
  0 siblings, 0 replies; 10+ messages in thread
From: Myklebust, Trond @ 2013-11-12 17:50 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Jeff Layton, Weston Andros Adamson, linux-nfs list, chuck.lever

On Tue, 2013-11-12 at 11:57 -0500, J. Bruce Fields wrote:
> On Tue, Nov 12, 2013 at 11:20:21AM -0500, Jeff Layton wrote:
> > On Tue, 12 Nov 2013 10:55:39 -0500
> > Jeff Layton <jlayton@redhat.com> wrote:
> > 
> > > On Tue, 12 Nov 2013 15:31:34 +0000
> > > Weston Andros Adamson <dros@netapp.com> wrote:
> > 
> > How that ends up smashing the stack, I'm not sure though.
> 
> rpc_free_client(clnt)
> 	rpc_release_client(clnt->cl_parent)
> 		rpc_free_auth(clnt)
> 			free_free_client(clnt)
> 
> So freeing a client with N ancestors can take N times the stack as
> freeing a single client.
> 
> (Are there any other cases that can create arbitrarily long cl_parent
> chains?)

Ewww.... At this point, that would be pretty much anything that calls
rpc_clone_client_set_auth() in response to an NFS4ERR_WRONGSEC.
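
To make the "N ancestors, N times the stack" point concrete, here is a
small user-space model, assuming a simplified free path in which
freeing a client releases its parent from inside the same call. The
names (struct chain_clnt, chain_release(), chain_free(),
chain_build() and the two depth counters) are hypothetical and the
counters exist only to measure nesting; none of this is the sunrpc
code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of a refcounted client that pins its parent. */
struct chain_clnt {
	int refcount;
	struct chain_clnt *parent;
};

/* Instrumentation only: current and deepest nesting of chain_release(). */
static int chain_cur_depth;
static int chain_max_depth;

static void chain_release(struct chain_clnt *c);

/* Free the client, then release the parent from inside this call. */
static void chain_free(struct chain_clnt *c)
{
	struct chain_clnt *parent = c->parent;

	free(c);
	if (parent)
		chain_release(parent);	/* one more nesting level per ancestor */
}

/* Drop one reference; free the client when the count reaches zero. */
static void chain_release(struct chain_clnt *c)
{
	chain_cur_depth++;
	if (chain_cur_depth > chain_max_depth)
		chain_max_depth = chain_cur_depth;
	assert(c->refcount > 0);
	if (--c->refcount == 0)
		chain_free(c);
	chain_cur_depth--;
}

/*
 * Build a leaf with the given number of ancestors. Each client has a
 * refcount of 1: an ancestor's reference is held by its child, and the
 * leaf's reference belongs to the caller.
 */
static struct chain_clnt *chain_build(int ancestors)
{
	struct chain_clnt *cur = NULL;
	int i;

	for (i = 0; i <= ancestors; i++) {
		struct chain_clnt *c = malloc(sizeof(*c));

		if (!c)
			abort();
		c->refcount = 1;
		c->parent = cur;
		cur = c;
	}
	return cur;
}
```

Releasing the leaf of a chain built with N ancestors nests
chain_release() N + 1 levels deep in this model, so the depth grows
linearly with the length of the cl_parent chain.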

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com


* Re: Thread overran stack, or stack corrupted BUG on mount
  2013-11-12 17:30       ` Myklebust, Trond
  2013-11-12 17:33         ` Chuck Lever
@ 2013-11-12 17:52         ` Weston Andros Adamson
  1 sibling, 0 replies; 10+ messages in thread
From: Weston Andros Adamson @ 2013-11-12 17:52 UTC (permalink / raw)
  To: Myklebust, Trond; +Cc: Chuck Lever, Jeff Layton, linux-nfs list


On Nov 12, 2013, at 12:30 PM, Myklebust, Trond <Trond.Myklebust@netapp.com> wrote:

> On Tue, 2013-11-12 at 11:23 -0500, Chuck Lever wrote:
>> On Nov 12, 2013, at 11:20 AM, Jeff Layton <jlayton@redhat.com> wrote:
>>> Ok, I think I see the problem. The looping comes from this block in
>>> nfs4_discover_server_trunking:
>>> 
>>> -----------------[snip]-----------------
>>>       case -NFS4ERR_CLID_INUSE:
>>>       case -NFS4ERR_WRONGSEC:
>>>               clnt = rpc_clone_client_set_auth(clnt, RPC_AUTH_UNIX);
>>>               if (IS_ERR(clnt)) {
>>>                       status = PTR_ERR(clnt);
>>>                       break;
>>>               }
>>>               /* Note: this is safe because we haven't yet marked the
>>>                * client as ready, so we are the only user of
>>>                * clp->cl_rpcclient
>>>                */
>>>               clnt = xchg(&clp->cl_rpcclient, clnt);
>>>               rpc_shutdown_client(clnt);
>>>               clnt = clp->cl_rpcclient;
>>>               goto again;
>>> -----------------[snip]-----------------
>>> 
>>> ...so in the case of the reproducer, we get back -NFS4ERR_CLID_INUSE,
>>> at that point we call rpc_clone_client_set_auth(), which creates a new
>>> rpc_clnt, but it's created as a child of the original.
>>> 
>>> When rpc_shutdown_client is called, the original clnt is not destroyed
>>> because the child still holds a reference to it. So, we go and try the
>>> call again and it fails with the same error over and over again, and we
>>> end up with a long chain of rpc_clnt's.
>>> 
>>> How that ends up smashing the stack, I'm not sure though. I'm also not
>>> sure of the remedy. It seems like we might ought to have some upper
>>> bound on the number of SETCLIENTID attempts?
>> 
>> CLID_INUSE is supposed to be a permanent error now.  I think one retry, if any, is all that is appropriate.
> 
> Right. If we hit CLID_INUSE in nfs4_discover_server_trunking then
> 
> a) we know this is a server that we've already mounted
> b) we know that either nfs4_init_client set us up with RPC_AUTH_UNIX to
> begin with, or that rpc.gssd was started only after we'd already sent a
> SETCLIENTID/EXCHANGE_ID using RPC_AUTH_UNIX to this server
> 
> so the correct thing to do is to retry once if we know that we're not
> already using AUTH_SYS, and then to EPERM.
> 
> 
> Now that said, I agree that this should not be able to trigger a stack
> overflow. Is this NFSv4 or NFSv4.1/NFSv4.2? Have either of you (Jeff and
> Dros) tried enabling DEBUG_STACKOVERFLOW?

IIRC it was a v4.0 mount when I hit this.  Yes, I have CONFIG_DEBUG_STACKOVERFLOW=y.

-dros

> 
> -- 
> Trond Myklebust
> Linux NFS client maintainer
> 
> NetApp
> Trond.Myklebust@netapp.com
> www.netapp.com



Thread overview: 10+ messages
2013-11-12 15:31 Thread overran stack, or stack corrupted BUG on mount Weston Andros Adamson
2013-11-12 15:55 ` Jeff Layton
2013-11-12 16:20   ` Jeff Layton
2013-11-12 16:23     ` Chuck Lever
2013-11-12 17:30       ` Myklebust, Trond
2013-11-12 17:33         ` Chuck Lever
2013-11-12 17:41           ` Jeff Layton
2013-11-12 17:52         ` Weston Andros Adamson
2013-11-12 16:57     ` J. Bruce Fields
2013-11-12 17:50       ` Myklebust, Trond
