linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* v3.5 nfsd4 regression; utime sometimes takes 40+ seconds to return
@ 2012-08-15  6:11 Jamie Heilman
  2012-08-15 13:58 ` Jamie Heilman
  0 siblings, 1 reply; 8+ messages in thread
From: Jamie Heilman @ 2012-08-15  6:11 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2064 bytes --]

I really wish I could have nailed this down better, but I've had a
hard time reliably reproducing the problem during bisection, and I
haven't seen anyone report a similar sounding problem.  Here's what
I've seen: since 3.5 I've been having spurious delays on my nfs
clients, noticable particularly when I open an mbox file in mutt over
an nfs v4 mount from a v3.5 or later server.  The servers I've
reproduced this on are all uni-proc 32-bit systems... but then I
haven't tried SMP or 64-bit systems yet, it may or may not exist
there.  When the delay occurs, it's quite noticable.  I've never seen
one that takes less than 40 seconds to "unstick."  I wrote a quick and
dirty reproduction tool, based on the syscalls mutt was doing that
triggered the problem, attached to this message.  To use it, compile
the file as utime-test on an exported volume, then execute with (cd
/some/mount/point && strace -T ./utime-test) from a nfs4 client.

For whatever, reason I frequently find the second call to utime takes
an irritatingly long time to return and I see something like:
utime("utime-test.c", [2012/08/14-22:47:21, 2012/08/14-17:25:21]) = 0 <70.510913>
in the strace output.

I've reproduced this on Debian Squeeze / nfs-utils 1.2.2 based servers
(legacy idmapper, no user-space nfsidmap), as well as Debian Wheezy /
nfs-utils 1.2.6 (uses keyutils upcalls) servers, so I doubt it's a
user-space related issue...  Attempts to bisect have been muddled,
I'll keep trying in the interim, but the best I've been able to pin
things down is that issue was probably introduced in the
419f4319495043a9507ac3e616be9ca60af09744 merge.  I can't repo on a
kernel based on fb21affa49204acd409328415b49bfe90136653c.  (I say
based on, because I have to apply the patch from
http://marc.info/?l=linux-nfs&m=133950479803025 or face additional
problems.)

I'll try to get full rcpdebug traces on client and server as the delay
is occuring in the hopes that helps pin things down, and post them
separately.

-- 
Jamie Heilman                     http://audible.transient.net/~jamie/

[-- Attachment #2: utime-test.c --]
[-- Type: text/x-csrc, Size: 696 bytes --]

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <time.h>
#include <utime.h>
#include <stdio.h>

#define F_PATH "utime-test.c"

int main()
{
	struct stat sb;
	struct utimbuf ub;
	int fd;

	if (stat(F_PATH, &sb) == -1) {
		perror(NULL);
		return -1;
	}
	ub.modtime = sb.st_mtime;
	ub.actime = time(NULL);
	if (utime(F_PATH, &ub) == -1) {
		perror(NULL);
		return -2;
	}
	if ((fd = open(F_PATH, O_RDONLY)) == -1) {
		perror(NULL);
		return -3;
	}
	close(fd);
	if (stat(F_PATH, &sb) == -1) {
		perror(NULL);
		return -1;
	}
	ub.modtime = sb.st_mtime;
	ub.actime = time(NULL);
	if (utime(F_PATH, &ub) == -1) {
		perror(NULL);
		return -2;
	}

	return 0;
}

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: v3.5 nfsd4 regression; utime sometimes takes 40+ seconds to return
  2012-08-15  6:11 v3.5 nfsd4 regression; utime sometimes takes 40+ seconds to return Jamie Heilman
@ 2012-08-15 13:58 ` Jamie Heilman
  2012-08-15 15:02   ` J. Bruce Fields
  0 siblings, 1 reply; 8+ messages in thread
From: Jamie Heilman @ 2012-08-15 13:58 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 938 bytes --]

Jamie Heilman wrote:
> I'll try to get full rcpdebug traces on client and server as the delay
> is occuring in the hopes that helps pin things down, and post them
> separately.

OK, here are the logs from client and server, where a run of my test program
under strace -T resulted in:

utime("utime-test.c", [2012/08/15-13:35:20, 2012/08/15-13:22:04]) = 0 <0.150815>
open("utime-test.c", O_RDONLY)          = 3 <0.242635>
close(3)                                = 0 <0.147768>
stat("utime-test.c", {st_mode=S_IFREG|0644, st_size=696, ...}) = 0 <0.002772>
utime("utime-test.c", [2012/08/15-13:35:20, 2012/08/15-13:22:04]) = 0 <71.878058>

The client is a 64-bit v3.4.8 kernel, the server is 32-bit, v3.5.1 +
the two sunprc patches that will be in v3.5.2.

(The client's system clock is a touch faster than the server's, but
these logs start at the same instant.)


-- 
Jamie Heilman                     http://audible.transient.net/~jamie/

[-- Attachment #2: server.gz --]
[-- Type: application/octet-stream, Size: 6640 bytes --]

[-- Attachment #3: client.gz --]
[-- Type: application/octet-stream, Size: 17390 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: v3.5 nfsd4 regression; utime sometimes takes 40+ seconds to return
  2012-08-15 13:58 ` Jamie Heilman
@ 2012-08-15 15:02   ` J. Bruce Fields
  2012-08-16 20:18     ` Jamie Heilman
  0 siblings, 1 reply; 8+ messages in thread
From: J. Bruce Fields @ 2012-08-15 15:02 UTC (permalink / raw)
  To: linux-nfs, linux-kernel

On Wed, Aug 15, 2012 at 01:58:54PM +0000, Jamie Heilman wrote:
> Jamie Heilman wrote:
> > I'll try to get full rcpdebug traces on client and server as the delay
> > is occuring in the hopes that helps pin things down, and post them
> > separately.
> 
> OK, here are the logs from client and server, where a run of my test program
> under strace -T resulted in:
> 
> utime("utime-test.c", [2012/08/15-13:35:20, 2012/08/15-13:22:04]) = 0 <0.150815>
> open("utime-test.c", O_RDONLY)          = 3 <0.242635>
> close(3)                                = 0 <0.147768>
> stat("utime-test.c", {st_mode=S_IFREG|0644, st_size=696, ...}) = 0 <0.002772>
> utime("utime-test.c", [2012/08/15-13:35:20, 2012/08/15-13:22:04]) = 0 <71.878058>
> 
> The client is a 64-bit v3.4.8 kernel, the server is 32-bit, v3.5.1 +
> the two sunprc patches that will be in v3.5.2.
> 
> (The client's system clock is a touch faster than the server's, but
> these logs start at the same instant.)

Thanks for all the details.

What's probably happening is that the client is returning a delegation
with the open.  The setattr then breaks that delegation; you can see it
getting 10008 (NFS4ERR_DELAY) replies while the server waits for the
delegation to be returned.  But for some reason the callback to break
the delegation isn't working.  ("NFSD: warning: no callback path to
client 192.168.2.42/192.168.2.4 tcp UNIX 0: error -110" (110 is
ETIMEDOUT).)  So instead you wait for the delegation to time out and get
forcibly revoked.

The reproducer might be more reliable if you did two opens.

It'd be worth looking at the traffic in wireshark.  You should see
setattr, open, close, setattr, a DELAY reply to the setattr, a
CB_RECALL, and then a DELEGRETURN that gets a succesful reply.  But for
some reason the DELEGRETURN isn't getting through in your case, I'm not
sure why.  I can't reproduce that.  You'll need to start wireshark
before you mount to make sure it knows how to parse the callbacks.

I don't see anything in that range of git commits that looks suspicious,
but I may well have overlooked something.

--b.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: v3.5 nfsd4 regression; utime sometimes takes 40+ seconds to return
  2012-08-15 15:02   ` J. Bruce Fields
@ 2012-08-16 20:18     ` Jamie Heilman
  2012-08-16 20:42       ` J. Bruce Fields
  0 siblings, 1 reply; 8+ messages in thread
From: Jamie Heilman @ 2012-08-16 20:18 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3180 bytes --]

J. Bruce Fields wrote:
> On Wed, Aug 15, 2012 at 01:58:54PM +0000, Jamie Heilman wrote:
> > Jamie Heilman wrote:
> > > I'll try to get full rcpdebug traces on client and server as the delay
> > > is occuring in the hopes that helps pin things down, and post them
> > > separately.
> > 
> > OK, here are the logs from client and server, where a run of my test program
> > under strace -T resulted in:
> > 
> > utime("utime-test.c", [2012/08/15-13:35:20, 2012/08/15-13:22:04]) = 0 <0.150815>
> > open("utime-test.c", O_RDONLY)          = 3 <0.242635>
> > close(3)                                = 0 <0.147768>
> > stat("utime-test.c", {st_mode=S_IFREG|0644, st_size=696, ...}) = 0 <0.002772>
> > utime("utime-test.c", [2012/08/15-13:35:20, 2012/08/15-13:22:04]) = 0 <71.878058>
> > 
> > The client is a 64-bit v3.4.8 kernel, the server is 32-bit, v3.5.1 +
> > the two sunprc patches that will be in v3.5.2.
> > 
> > (The client's system clock is a touch faster than the server's, but
> > these logs start at the same instant.)
> 
> Thanks for all the details.
> 
> What's probably happening is that the client is returning a delegation
> with the open.  The setattr then breaks that delegation; you can see it
> getting 10008 (NFS4ERR_DELAY) replies while the server waits for the
> delegation to be returned.  But for some reason the callback to break
> the delegation isn't working.  ("NFSD: warning: no callback path to
> client 192.168.2.42/192.168.2.4 tcp UNIX 0: error -110" (110 is
> ETIMEDOUT).)  So instead you wait for the delegation to time out and get
> forcibly revoked.
> 
> The reproducer might be more reliable if you did two opens.

I made the change... not entirely sure it helped, but I think I've
bisected this reliably anyway.  It came down to:

d5497fc693a446ce9100fcf4117c3f795ddfd0d2 is the first bad commit
commit d5497fc693a446ce9100fcf4117c3f795ddfd0d2
Author: J. Bruce Fields <bfields@redhat.com>
Date:   Mon May 14 22:06:49 2012 -0400

    nfsd4: move rq_flavor into svc_cred
    
    Move the rq_flavor into struct svc_cred, and use it in setclientid and
    exchange_id comparisons as well.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>

:040000 040000 165568aae0fc47067863ea8ca911494033cbf2ce 54244f0084f58df4844b09d56e8e6bad0e19d913 M      fs
:040000 040000 bae5a74a85bda2aa2b155b0bdca1ea0028502164 207751ce30c8f49c68fb0a85d0772cb0fa426ae1 M      include
:040000 040000 ef940cf0c558fe65a21a710e6c95b2cefb2996fd 459ac143acf5cb2b3b7f59df29c417cedbb23c4b M      net

FWIW reverting this commit does seem to fix the problem.

> It'd be worth looking at the traffic in wireshark.  You should see
> setattr, open, close, setattr, a DELAY reply to the setattr, a
> CB_RECALL, and then a DELEGRETURN that gets a succesful reply.  But for
> some reason the DELEGRETURN isn't getting through in your case, I'm not
> sure why.  I can't reproduce that.  You'll need to start wireshark
> before you mount to make sure it knows how to parse the callbacks.

Captures from the server showing the delay (running 3.5.2) and the
previous behavior (running 3.4.9) are attached.

-- 
Jamie Heilman                     http://audible.transient.net/~jamie/

[-- Attachment #2: nfsd4-delay.pcap.gz --]
[-- Type: application/octet-stream, Size: 8805 bytes --]

[-- Attachment #3: nfsd4-ok.pcap.gz --]
[-- Type: application/octet-stream, Size: 7270 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: v3.5 nfsd4 regression; utime sometimes takes 40+ seconds to return
  2012-08-16 20:18     ` Jamie Heilman
@ 2012-08-16 20:42       ` J. Bruce Fields
  2012-08-16 21:09         ` J. Bruce Fields
  0 siblings, 1 reply; 8+ messages in thread
From: J. Bruce Fields @ 2012-08-16 20:42 UTC (permalink / raw)
  To: linux-nfs, linux-kernel

On Thu, Aug 16, 2012 at 08:18:38PM +0000, Jamie Heilman wrote:
> J. Bruce Fields wrote:
> > On Wed, Aug 15, 2012 at 01:58:54PM +0000, Jamie Heilman wrote:
> > > Jamie Heilman wrote:
> > > > I'll try to get full rcpdebug traces on client and server as the delay
> > > > is occuring in the hopes that helps pin things down, and post them
> > > > separately.
> > > 
> > > OK, here are the logs from client and server, where a run of my test program
> > > under strace -T resulted in:
> > > 
> > > utime("utime-test.c", [2012/08/15-13:35:20, 2012/08/15-13:22:04]) = 0 <0.150815>
> > > open("utime-test.c", O_RDONLY)          = 3 <0.242635>
> > > close(3)                                = 0 <0.147768>
> > > stat("utime-test.c", {st_mode=S_IFREG|0644, st_size=696, ...}) = 0 <0.002772>
> > > utime("utime-test.c", [2012/08/15-13:35:20, 2012/08/15-13:22:04]) = 0 <71.878058>
> > > 
> > > The client is a 64-bit v3.4.8 kernel, the server is 32-bit, v3.5.1 +
> > > the two sunprc patches that will be in v3.5.2.
> > > 
> > > (The client's system clock is a touch faster than the server's, but
> > > these logs start at the same instant.)
> > 
> > Thanks for all the details.
> > 
> > What's probably happening is that the client is returning a delegation
> > with the open.  The setattr then breaks that delegation; you can see it
> > getting 10008 (NFS4ERR_DELAY) replies while the server waits for the
> > delegation to be returned.  But for some reason the callback to break
> > the delegation isn't working.  ("NFSD: warning: no callback path to
> > client 192.168.2.42/192.168.2.4 tcp UNIX 0: error -110" (110 is
> > ETIMEDOUT).)  So instead you wait for the delegation to time out and get
> > forcibly revoked.
> > 
> > The reproducer might be more reliable if you did two opens.
> 
> I made the change... not entirely sure it helped, but I think I've
> bisected this reliably anyway.  It came down to:

Hm, weird.  In the good case the cb_recall's done with auth_unix, in the
bad case with auth_null.  OK, that should be enough to go on....

Thanks for digging into this!

--b.

> 
> d5497fc693a446ce9100fcf4117c3f795ddfd0d2 is the first bad commit
> commit d5497fc693a446ce9100fcf4117c3f795ddfd0d2
> Author: J. Bruce Fields <bfields@redhat.com>
> Date:   Mon May 14 22:06:49 2012 -0400
> 
>     nfsd4: move rq_flavor into svc_cred
>     
>     Move the rq_flavor into struct svc_cred, and use it in setclientid and
>     exchange_id comparisons as well.
>     
>     Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> 
> :040000 040000 165568aae0fc47067863ea8ca911494033cbf2ce 54244f0084f58df4844b09d56e8e6bad0e19d913 M      fs
> :040000 040000 bae5a74a85bda2aa2b155b0bdca1ea0028502164 207751ce30c8f49c68fb0a85d0772cb0fa426ae1 M      include
> :040000 040000 ef940cf0c558fe65a21a710e6c95b2cefb2996fd 459ac143acf5cb2b3b7f59df29c417cedbb23c4b M      net
> 
> FWIW reverting this commit does seem to fix the problem.
> 
> > It'd be worth looking at the traffic in wireshark.  You should see
> > setattr, open, close, setattr, a DELAY reply to the setattr, a
> > CB_RECALL, and then a DELEGRETURN that gets a succesful reply.  But for
> > some reason the DELEGRETURN isn't getting through in your case, I'm not
> > sure why.  I can't reproduce that.  You'll need to start wireshark
> > before you mount to make sure it knows how to parse the callbacks.
> 
> Captures from the server showing the delay (running 3.5.2) and the
> previous behavior (running 3.4.9) are attached.
> 
> -- 
> Jamie Heilman                     http://audible.transient.net/~jamie/




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: v3.5 nfsd4 regression; utime sometimes takes 40+ seconds to return
  2012-08-16 20:42       ` J. Bruce Fields
@ 2012-08-16 21:09         ` J. Bruce Fields
  2012-08-16 21:34           ` Jamie Heilman
  0 siblings, 1 reply; 8+ messages in thread
From: J. Bruce Fields @ 2012-08-16 21:09 UTC (permalink / raw)
  To: linux-nfs, linux-kernel

On Thu, Aug 16, 2012 at 04:42:08PM -0400, J. Bruce Fields wrote:
> Hm, weird.  In the good case the cb_recall's done with auth_unix, in the
> bad case with auth_null.  OK, that should be enough to go on....
> 
> Thanks for digging into this!

This should fix it--could you confirm?

--b.

commit e950bebdac1f17121f972728489cdba43734d56d
Author: J. Bruce Fields <bfields@redhat.com>
Date:   Thu Aug 16 17:01:21 2012 -0400

    nfsd4: fix security flavor of NFSv4.0 callback
    
    Commit d5497fc693a446ce9100fcf4117c3f795ddfd0d2 "nfsd4: move rq_flavor
    into svc_cred" forgot to remove cl_flavor from the client, leaving two
    places (cl_flavor and cl_cred.cr_flavor) for the flavor to be stored.
    After that patch, the latter was the one that was updated, but the
    former was the one that the callback used.
    
    Symptoms were a long delay on utime().  This is because the utime()
    generated a setattr which recalled a delegation, but the cb_recall was
    ignored by the client because it had the wrong security flavor.
    
    Cc: stable@vger.kernel.org
    Reported-by: Jamie Heilman <jamie@audible.transient.net>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index cbaf4f8..4c7bd35 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -651,12 +651,12 @@ static int setup_callback_client(struct nfs4_client *clp, struct nfs4_cb_conn *c
 
 	if (clp->cl_minorversion == 0) {
 		if (!clp->cl_cred.cr_principal &&
-				(clp->cl_flavor >= RPC_AUTH_GSS_KRB5))
+				(clp->cl_cred.cr_flavor >= RPC_AUTH_GSS_KRB5))
 			return -EINVAL;
 		args.client_name = clp->cl_cred.cr_principal;
 		args.prognumber	= conn->cb_prog,
 		args.protocol = XPRT_TRANSPORT_TCP;
-		args.authflavor = clp->cl_flavor;
+		args.authflavor = clp->cl_cred.cr_flavor;
 		clp->cl_cb_ident = conn->cb_ident;
 	} else {
 		if (!conn->cb_xprt)
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 167d7d8..9db0bb5 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -231,7 +231,6 @@ struct nfs4_client {
 	nfs4_verifier		cl_verifier; 	/* generated by client */
 	time_t                  cl_time;        /* time of last lease renewal */
 	struct sockaddr_storage	cl_addr; 	/* client ipaddress */
-	u32			cl_flavor;	/* setclientid pseudoflavor */
 	struct svc_cred		cl_cred; 	/* setclientid principal */
 	clientid_t		cl_clientid;	/* generated by server */
 	nfs4_verifier		cl_confirm;	/* generated by server */

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: v3.5 nfsd4 regression; utime sometimes takes 40+ seconds to return
  2012-08-16 21:09         ` J. Bruce Fields
@ 2012-08-16 21:34           ` Jamie Heilman
  2012-08-17 13:08             ` J. Bruce Fields
  0 siblings, 1 reply; 8+ messages in thread
From: Jamie Heilman @ 2012-08-16 21:34 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, linux-kernel

J. Bruce Fields wrote:
> On Thu, Aug 16, 2012 at 04:42:08PM -0400, J. Bruce Fields wrote:
> > Hm, weird.  In the good case the cb_recall's done with auth_unix, in the
> > bad case with auth_null.  OK, that should be enough to go on....
> > 
> > Thanks for digging into this!
> 
> This should fix it--could you confirm?

Yep, looks like that fixes it.

> commit e950bebdac1f17121f972728489cdba43734d56d
> Author: J. Bruce Fields <bfields@redhat.com>
> Date:   Thu Aug 16 17:01:21 2012 -0400
> 
>     nfsd4: fix security flavor of NFSv4.0 callback
>     
>     Commit d5497fc693a446ce9100fcf4117c3f795ddfd0d2 "nfsd4: move rq_flavor
>     into svc_cred" forgot to remove cl_flavor from the client, leaving two
>     places (cl_flavor and cl_cred.cr_flavor) for the flavor to be stored.
>     After that patch, the latter was the one that was updated, but the
>     former was the one that the callback used.
>     
>     Symptoms were a long delay on utime().  This is because the utime()
>     generated a setattr which recalled a delegation, but the cb_recall was
>     ignored by the client because it had the wrong security flavor.
>     
>     Cc: stable@vger.kernel.org
>     Reported-by: Jamie Heilman <jamie@audible.transient.net>
>     Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> 
> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> index cbaf4f8..4c7bd35 100644
> --- a/fs/nfsd/nfs4callback.c
> +++ b/fs/nfsd/nfs4callback.c
> @@ -651,12 +651,12 @@ static int setup_callback_client(struct nfs4_client *clp, struct nfs4_cb_conn *c
>  
>  	if (clp->cl_minorversion == 0) {
>  		if (!clp->cl_cred.cr_principal &&
> -				(clp->cl_flavor >= RPC_AUTH_GSS_KRB5))
> +				(clp->cl_cred.cr_flavor >= RPC_AUTH_GSS_KRB5))
>  			return -EINVAL;
>  		args.client_name = clp->cl_cred.cr_principal;
>  		args.prognumber	= conn->cb_prog,
>  		args.protocol = XPRT_TRANSPORT_TCP;
> -		args.authflavor = clp->cl_flavor;
> +		args.authflavor = clp->cl_cred.cr_flavor;
>  		clp->cl_cb_ident = conn->cb_ident;
>  	} else {
>  		if (!conn->cb_xprt)
> diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
> index 167d7d8..9db0bb5 100644
> --- a/fs/nfsd/state.h
> +++ b/fs/nfsd/state.h
> @@ -231,7 +231,6 @@ struct nfs4_client {
>  	nfs4_verifier		cl_verifier; 	/* generated by client */
>  	time_t                  cl_time;        /* time of last lease renewal */
>  	struct sockaddr_storage	cl_addr; 	/* client ipaddress */
> -	u32			cl_flavor;	/* setclientid pseudoflavor */
>  	struct svc_cred		cl_cred; 	/* setclientid principal */
>  	clientid_t		cl_clientid;	/* generated by server */
>  	nfs4_verifier		cl_confirm;	/* generated by server */
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

-- 
Jamie Heilman                     http://audible.transient.net/~jamie/

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: v3.5 nfsd4 regression; utime sometimes takes 40+ seconds to return
  2012-08-16 21:34           ` Jamie Heilman
@ 2012-08-17 13:08             ` J. Bruce Fields
  0 siblings, 0 replies; 8+ messages in thread
From: J. Bruce Fields @ 2012-08-17 13:08 UTC (permalink / raw)
  To: linux-nfs, linux-kernel

On Thu, Aug 16, 2012 at 09:34:24PM +0000, Jamie Heilman wrote:
> J. Bruce Fields wrote:
> > On Thu, Aug 16, 2012 at 04:42:08PM -0400, J. Bruce Fields wrote:
> > > Hm, weird.  In the good case the cb_recall's done with auth_unix, in the
> > > bad case with auth_null.  OK, that should be enough to go on....
> > > 
> > > Thanks for digging into this!
> > 
> > This should fix it--could you confirm?
> 
> Yep, looks like that fixes it.

Thanks!

I'll queue this up for 3.6 and stable.

--b.

> 
> > commit e950bebdac1f17121f972728489cdba43734d56d
> > Author: J. Bruce Fields <bfields@redhat.com>
> > Date:   Thu Aug 16 17:01:21 2012 -0400
> > 
> >     nfsd4: fix security flavor of NFSv4.0 callback
> >     
> >     Commit d5497fc693a446ce9100fcf4117c3f795ddfd0d2 "nfsd4: move rq_flavor
> >     into svc_cred" forgot to remove cl_flavor from the client, leaving two
> >     places (cl_flavor and cl_cred.cr_flavor) for the flavor to be stored.
> >     After that patch, the latter was the one that was updated, but the
> >     former was the one that the callback used.
> >     
> >     Symptoms were a long delay on utime().  This is because the utime()
> >     generated a setattr which recalled a delegation, but the cb_recall was
> >     ignored by the client because it had the wrong security flavor.
> >     
> >     Cc: stable@vger.kernel.org
> >     Reported-by: Jamie Heilman <jamie@audible.transient.net>
> >     Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> > 
> > diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> > index cbaf4f8..4c7bd35 100644
> > --- a/fs/nfsd/nfs4callback.c
> > +++ b/fs/nfsd/nfs4callback.c
> > @@ -651,12 +651,12 @@ static int setup_callback_client(struct nfs4_client *clp, struct nfs4_cb_conn *c
> >  
> >  	if (clp->cl_minorversion == 0) {
> >  		if (!clp->cl_cred.cr_principal &&
> > -				(clp->cl_flavor >= RPC_AUTH_GSS_KRB5))
> > +				(clp->cl_cred.cr_flavor >= RPC_AUTH_GSS_KRB5))
> >  			return -EINVAL;
> >  		args.client_name = clp->cl_cred.cr_principal;
> >  		args.prognumber	= conn->cb_prog,
> >  		args.protocol = XPRT_TRANSPORT_TCP;
> > -		args.authflavor = clp->cl_flavor;
> > +		args.authflavor = clp->cl_cred.cr_flavor;
> >  		clp->cl_cb_ident = conn->cb_ident;
> >  	} else {
> >  		if (!conn->cb_xprt)
> > diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
> > index 167d7d8..9db0bb5 100644
> > --- a/fs/nfsd/state.h
> > +++ b/fs/nfsd/state.h
> > @@ -231,7 +231,6 @@ struct nfs4_client {
> >  	nfs4_verifier		cl_verifier; 	/* generated by client */
> >  	time_t                  cl_time;        /* time of last lease renewal */
> >  	struct sockaddr_storage	cl_addr; 	/* client ipaddress */
> > -	u32			cl_flavor;	/* setclientid pseudoflavor */
> >  	struct svc_cred		cl_cred; 	/* setclientid principal */
> >  	clientid_t		cl_clientid;	/* generated by server */
> >  	nfs4_verifier		cl_confirm;	/* generated by server */
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at  http://www.tux.org/lkml/
> 
> -- 
> Jamie Heilman                     http://audible.transient.net/~jamie/
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2012-08-17 13:08 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-08-15  6:11 v3.5 nfsd4 regression; utime sometimes takes 40+ seconds to return Jamie Heilman
2012-08-15 13:58 ` Jamie Heilman
2012-08-15 15:02   ` J. Bruce Fields
2012-08-16 20:18     ` Jamie Heilman
2012-08-16 20:42       ` J. Bruce Fields
2012-08-16 21:09         ` J. Bruce Fields
2012-08-16 21:34           ` Jamie Heilman
2012-08-17 13:08             ` J. Bruce Fields

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).