From: "Myklebust, Trond" <Trond.Myklebust@netapp.com>
To: Nix <nix@esperi.org.uk>
Cc: "Toralf Förster" <toralf.foerster@gmx.de>,
	"Oleg Nesterov" <oleg@redhat.com>,
	"Jeff Layton" <jlayton@redhat.com>,
	"NFS list" <linux-nfs@vger.kernel.org>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	"dhowells@redhat.com" <dhowells@redhat.com>
Subject: Re: [3.10.4] NFS locking panic, plus persisting NFS shutdown panic from 3.9.*
Date: Mon, 5 Aug 2013 19:12:55 +0000
Message-ID: <1375729975.7337.39.camel@leira.trondhjem.org>
In-Reply-To: <87txj4rnru.fsf@spindle.srvr.nix>

[-- Attachment #1: Type: text/plain, Size: 4559 bytes --]

On Mon, 2013-08-05 at 19:33 +0100, Nix wrote:
> On 5 Aug 2013, Trond Myklebust told this:
> > Does the attached patch fix the problem?
> 
> > From 3c50ba80105464a28d456d9a1e0f1d81d4af92a8 Mon Sep 17 00:00:00 2001
> > From: Trond Myklebust <Trond.Myklebust@netapp.com>
> > Date: Mon, 5 Aug 2013 12:06:12 -0400
> > Subject: [PATCH] LOCKD: Don't call utsname()->nodename from
> >  nlmclnt_setlockargs
> > MIME-Version: 1.0
> > Content-Type: text/plain; charset=UTF-8
> > Content-Transfer-Encoding: 8bit
> 
> It makes it worse. Much, much worse. From a crash every so often when
> I'm doing compilations over NFS, I get an immediate panic on startx,
> long long before I even try to replicate the earlier panic:
> 
> [   83.432358] task: ffff88041aaa5ac0 ti: ffff8804199e2000 task.ti: ffff8804199e2000
> [   83.432428] RIP: 0010:[<ffffffff8124af69>] [<ffffffff8124af69>] encode_nlm4_lock+0x26/0xbe
> [   83.432512] RSP: 0018:ffff8804199e3a78  EFLAGS: 00010286
> [   83.432564] RAX: 0000000000000000 RBX: ffff88041a577038 RCX: ffffffffffffffff
> [   83.432630] RDX: ffff8804193b3098 RSI: ffff88041a577038 RDI: 000000000000008c
> [   83.432697] RBP: ffff8804199e3aa8 R08: ffff8804193b3098 R09: 0000000000000001
> [   83.432763] R10: ffff88042fa12980 R11: ffff88042fa12980 R12: ffff8804199e3ae8
> [   83.432830] R13: 000000000000008c R14: ffff8804199e3fd8 R15: ffffffff815de80e
> [   83.432898] FS:  00007f594b40c740(0000) GS:ffff88042fa00000(0000) knlGS:0000000000000000
> [   83.432974] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   83.433028] CR2: 000000000000008c CR3: 000000041ab3d000 CR4: 00000000001407f0
> [   83.433095] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   83.433176] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [   83.433255] Stack:
> [   83.433276]  ffff88041a44fb70 ffff880400000004 ffff8804199e3ae8 ffff88041a577010 
> [   83.433360]  ffff8804188e0e00 ffff8804199e3fd8 ffff8804199e3ac8 ffffffff8124b0d7 
> [   83.433443]  ffff8804188e0e00 ffffffff8124b086 ffff8804199e3b38 ffffffff815e6032 
> [   83.433616] Call Trace:
> [   83.433646]  [<ffffffff8124b0d7>] nlm4_xdr_enc_lockargs+0x51/0x76
> [   83.433707]  [<ffffffff8124b086>] ? nlm4_xdr_enc_cancargs+0x56/0x56
> [   83.433769]  [<ffffffff815e6032>] rpcauth_wrap_req+0x57/0x62
> [   83.433826]  [<ffffffff815de98a>] call_transmit+0x17c/0x1f9
> [   83.433880]  [<ffffffff815e4e58>] __rpc_execute+0xe8/0x2ca
> [   83.433935]  [<ffffffff815e50f9>] rpc_execute+0x76/0x9d
> [   83.433986]  [<ffffffff815debc1>] rpc_run_task+0x78/0x80
> [   83.434039]  [<ffffffff815decff>] rpc_call_sync+0x88/0x9e
> [   83.434092]  [<ffffffff81244b3c>] nlmclnt_call+0xb5/0x240
> [   83.434146]  [<ffffffff812454f0>] nlmclnt_proc+0x226/0x5fb
> [   83.434226]  [<ffffffff812209a2>] nfs3_proc_lock+0x21/0x23
> [   83.434280]  [<ffffffff81214a5e>] do_setlk+0x65/0xee
> [   83.434329]  [<ffffffff81214ca6>] nfs_lock+0x14e/0x162
> [   83.434382]  [<ffffffff81199661>] vfs_lock_file+0x29/0x35
> [   83.434435]  [<ffffffff8119a51d>] fcntl_setlk+0x139/0x2c5
> [   83.434490]  [<ffffffff81169621>] SyS_fcntl+0x2b6/0x47d
> [   83.434543]  [<ffffffff81613e92>] system_call_fastpath+0x16/0x1b
> [   83.434600] Code: 5b 41 5c 5d c3 0f 1f 44 00 00 55 31 c0 48 83 c9 ff 48 89 e5 41 56 41 55 41 54 49 89 fc 53 48 89 f3 48 83 ec 10 4c 8b 2e 4c 89 ef <f2> ae 4c 89 e7 48 f7 d1 4c 8d 71 ff 41 8d 76 04 e8 9f 16 3a 00 
> [   83.435077] RIP [<ffffffff8124af69>] encode_nlm4_lock+0x26/0xbe
> [   83.435140]  RSP <ffff8804199e3a78>
> [   83.435197] CR2: 000000000000008c
> 
> That's here:
> 
> (gdb) list *(encode_nlm4_lock+0x26)
> 0xffffffff8124af69 is in encode_nlm4_lock (fs/lockd/clnt4xdr.c:329).
> 324      *      string caller_name<LM_MAXSTRLEN>;
> 325      */
> 326     static void encode_caller_name(struct xdr_stream *xdr, const char *name)
> 327     {
> 328             /* NB: client-side does not set lock->len */
> 329             u32 length = strlen(name);
> 330             __be32 *p;
> 331
> 332             p = xdr_reserve_space(xdr, 4 + length);
> 333             xdr_encode_opaque(p, name, length);
> 
>    0xffffffff8124af69 <+38>:    repnz scas %es:(%rdi),%al
> 
> Pretty clearly, "name" can be NULL after this patch...
> 
Yes. This scheme will only work if we make sure that host->h_rpcclnt is
initialised at mount time. Here is a v2 patch that should do the right
thing.
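
As a rough illustration of why the oops above faults at 0x8c rather than
at 0, here is a user-space sketch. It assumes cl_nodename is an array
embedded in the rpc client structure at offset 0x8c, which is inferred
from the CR2 and RDI values in the trace and not checked against the real
struct rpc_clnt layout. Every fake_* name is hypothetical, and
fake_bind_at_mount() only stands in for the mount-time bind, it is not
the kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the rpc client. The padding is chosen so that
 * cl_nodename lands at offset 0x8c, matching CR2 in the trace; the real
 * structure layout is an assumption here. */
struct fake_rpc_clnt {
	char pad[0x8c];
	char cl_nodename[65];
};

/* Hypothetical stand-in for the nlm host: only the field that matters. */
struct fake_nlm_host {
	struct fake_rpc_clnt *h_rpcclnt;
};

/* Address that host->h_rpcclnt->cl_nodename evaluates to while h_rpcclnt
 * is still NULL. Computed with offsetof, so nothing is dereferenced. */
static uintptr_t fake_unbound_nodename_addr(void)
{
	return (uintptr_t)offsetof(struct fake_rpc_clnt, cl_nodename);
}

/* Models the mount-time check: bind only if no client is attached yet.
 * Returns 0 on success, -1 if the host is still unbound afterwards. */
static int fake_bind_at_mount(struct fake_nlm_host *host,
			      struct fake_rpc_clnt *clnt)
{
	assert(host != NULL);
	if (host->h_rpcclnt == NULL)
		host->h_rpcclnt = clnt;	/* stands in for the real bind */
	return host->h_rpcclnt != NULL ? 0 : -1;
}

/* Caller name as the lock arguments would see it, or NULL if unbound. */
static const char *fake_caller_name(const struct fake_nlm_host *host)
{
	return host->h_rpcclnt != NULL ? host->h_rpcclnt->cl_nodename : NULL;
}
```

Under those assumptions an unbound host yields a small non-NULL pointer,
so strlen() faults on the first byte it reads; once the bind has happened
at mount time the same expression points at a valid nodename.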
-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

[-- Attachment #2: 0001-LOCKD-Don-t-call-utsname-nodename-from-nlmclnt_setlo.patch --]
[-- Type: text/x-patch; name="0001-LOCKD-Don-t-call-utsname-nodename-from-nlmclnt_setlo.patch", Size: 2991 bytes --]

From 9a1b6bf818e74bb7aabaecb59492b739f2f4d742 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <Trond.Myklebust@netapp.com>
Date: Mon, 5 Aug 2013 12:06:12 -0400
Subject: [PATCH v2] LOCKD: Don't call utsname()->nodename from
 nlmclnt_setlockargs
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Firstly, nlmclnt_setlockargs can be called from a reclaimer thread, in
which case we're in entirely the wrong namespace.

Secondly, commit 8aac62706adaaf0fab02c4327761561c8bda9448 (move
exit_task_namespaces() outside of exit_notify()) now means that
exit_task_work() is called after exit_task_namespaces(), which
triggers an Oops when we're freeing up the locks.

Fix this by ensuring that we initialise the nlm_host's rpc_client at mount
time, so that the cl_nodename field is initialised to the value of
utsname()->nodename that the net namespace uses. Then replace the
lockd callers of utsname()->nodename.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Toralf Förster <toralf.foerster@gmx.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Nix <nix@esperi.org.uk>
Cc: Jeff Layton <jlayton@redhat.com>
Cc: stable@vger.kernel.org # 3.10.x
---
 fs/lockd/clntlock.c | 13 +++++++++----
 fs/lockd/clntproc.c |  5 +++--
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/fs/lockd/clntlock.c b/fs/lockd/clntlock.c
index 01bfe76..41e491b 100644
--- a/fs/lockd/clntlock.c
+++ b/fs/lockd/clntlock.c
@@ -64,12 +64,17 @@ struct nlm_host *nlmclnt_init(const struct nlmclnt_initdata *nlm_init)
 				   nlm_init->protocol, nlm_version,
 				   nlm_init->hostname, nlm_init->noresvport,
 				   nlm_init->net);
-	if (host == NULL) {
-		lockd_down(nlm_init->net);
-		return ERR_PTR(-ENOLCK);
-	}
+	if (host == NULL)
+		goto out_nohost;
+	if (host->h_rpcclnt == NULL && nlm_bind_host(host) == NULL)
+		goto out_nobind;
 
 	return host;
+out_nobind:
+	nlmclnt_release_host(host);
+out_nohost:
+	lockd_down(nlm_init->net);
+	return ERR_PTR(-ENOLCK);
 }
 EXPORT_SYMBOL_GPL(nlmclnt_init);
 
diff --git a/fs/lockd/clntproc.c b/fs/lockd/clntproc.c
index 9760ecb..acd3947 100644
--- a/fs/lockd/clntproc.c
+++ b/fs/lockd/clntproc.c
@@ -125,14 +125,15 @@ static void nlmclnt_setlockargs(struct nlm_rqst *req, struct file_lock *fl)
 {
 	struct nlm_args	*argp = &req->a_args;
 	struct nlm_lock	*lock = &argp->lock;
+	char *nodename = req->a_host->h_rpcclnt->cl_nodename;
 
 	nlmclnt_next_cookie(&argp->cookie);
 	memcpy(&lock->fh, NFS_FH(file_inode(fl->fl_file)), sizeof(struct nfs_fh));
-	lock->caller  = utsname()->nodename;
+	lock->caller  = nodename;
 	lock->oh.data = req->a_owner;
 	lock->oh.len  = snprintf(req->a_owner, sizeof(req->a_owner), "%u@%s",
 				(unsigned int)fl->fl_u.nfs_fl.owner->pid,
-				utsname()->nodename);
+				nodename);
 	lock->svid = fl->fl_u.nfs_fl.owner->pid;
 	lock->fl.fl_start = fl->fl_start;
 	lock->fl.fl_end = fl->fl_end;
-- 
1.8.3.1
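
The error handling in the clntlock.c hunk follows the usual goto-unwind
ordering: each label undoes one step, and a later failure falls through
the earlier labels. A minimal user-space model of that shape follows,
assuming the function takes a lockd reference before looking up the host
(implied by the lockd_down() call on the error path, not shown in the
hunk). The fake_* names and the counters are hypothetical and only track
what would need releasing.

```c
#include <stddef.h>

/* Hypothetical bookkeeping: how many references are currently held. */
struct fake_state {
	int lockd_users;	/* taken before the lookup, dropped on failure */
	int host_refs;		/* taken by a successful host lookup */
};

/* Models the control flow of the patched init path.
 * lookup_ok and bind_ok simulate whether each step succeeds.
 * Returns 0 on success, -1 on failure with everything unwound. */
static int fake_init(struct fake_state *s, int lookup_ok, int bind_ok)
{
	s->lockd_users++;		/* assumed earlier "up" step */

	if (!lookup_ok)
		goto out_nohost;
	s->host_refs++;			/* host lookup succeeded */

	if (!bind_ok)
		goto out_nobind;

	return 0;

out_nobind:
	s->host_refs--;			/* release the host first */
out_nohost:
	s->lockd_users--;		/* then drop the lockd reference */
	return -1;
}
```

In this model a bind failure releases the host and then drops the lockd
reference, while a lookup failure only drops the lockd reference, which
is the same ordering the two labels in the hunk give.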

