All of lore.kernel.org
 help / color / mirror / Atom feed
* NFS Kernel Bug
@ 2014-09-18 13:02 James Drews
  2014-09-18 14:31 ` Trond Myklebust
  0 siblings, 1 reply; 4+ messages in thread
From: James Drews @ 2014-09-18 13:02 UTC (permalink / raw)
  To: linux-nfs

[-- Attachment #1: Type: text/plain, Size: 1383 bytes --]

Good morning!

I believe we have found a bug in the NFS kernel area.  The "bug" is a 
leak of a file handle where the NFS client never tells the server to 
close the file. The problem is very similar to one we had reported and 
got a fix for previously. We are using that patch, but ran in to another 
case where the client sends out an OPEN_DOWNGRADE but never sends a CLOSE.

Attached is a simple c program that we have been able to reproduce the 
bug with, along with a packet capture of what we see on the wire.

To reproduce the bug:
-compile the c code
-execute the c code with:

./test ; cat testfile3 > /dev/nul

-now if we try to remove the file we get a file in use error (server is 
using mandatory locking)

Things to note:

-if you just run the program without the immediate cat'ing of the file, 
the bug does not happen
suggesting a timing issue
-If you alter the program so the code mimics the cat of the file, the 
bug does not happen (ie, add an open, read file, close to the code).
-If you run the program as described above, and then run it again 
without the "; cat testfile3 > /dev/nul", the kernel squeaks out the 
file close to the server when the code does the close.

The attached packet capture is us doing:

./test ; cat testfile3 > /dev/null
rm testfile3
./test
rm testfile3

where we are denied the rm the first time, but not the second.


Thanks
James


[-- Attachment #2: test.c --]
[-- Type: application/softgrid-c, Size: 450 bytes --]

[-- Attachment #3: nfsbug.zip --]
[-- Type: application/x-zip-compressed, Size: 6106 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: NFS Kernel Bug
  2014-09-18 13:02 NFS Kernel Bug James Drews
@ 2014-09-18 14:31 ` Trond Myklebust
  2014-09-18 16:00   ` Trond Myklebust
  0 siblings, 1 reply; 4+ messages in thread
From: Trond Myklebust @ 2014-09-18 14:31 UTC (permalink / raw)
  To: James Drews; +Cc: Linux NFS Mailing List

On Thu, Sep 18, 2014 at 9:02 AM, James Drews <drews@engr.wisc.edu> wrote:
> Good morning!
>
> I believe we have found a bug in the NFS kernel area.  The "bug" is a leak
> of a file handle where the NFS client never tells the server to close the
> file. The problem is very similar to one we had reported and got a fix for
> previously. We are using that patch, but ran in to another case where the
> client sends out an OPEN_DOWNGRADE but never sends a CLOSE.
>
> Attached is a simple c program that we have been able to reproduce the bug
> with, along with a packet capture of what we see on the wire.
>
> To reproduce the bug:
> -compile the c code
> -execute the c code with:
>
> ./test ; cat testfile3 > /dev/nul
>
> -now if we try to remove the file we get a file in use error (server is
> using mandatory locking)
>
> Things to note:
>
> -if you just run the program without the immediate cat'ing of the file, the
> bug does not happen
> suggesting a timing issue
> -If you alter the program so the code mimics the cat of the file, the bug
> does not happen (ie, add an open, read file, close to the code).
> -If you run the program as described above, and then run it again without
> the "; cat testfile3 > /dev/nul", the kernel squeaks out the file close to
> the server when the code does the close.
>
> The attached packet capture is us doing:
>
> ./test ; cat testfile3 > /dev/null
> rm testfile3
> ./test
> rm testfile3
>
> where we are denied the rm the first time, but not the second.
>

Argh. This is a situation where the client shouldn't have called
OPEN_DOWNGRADE, but should have done a CLOSE. The issue is that the
client opens the file with OPEN4_SHARE_ACCESS_BOTH, so it is not
allowed to downgrade to OPEN4_SHARE_ACCESS_READ. Instead it should
have closed the file, and then used the delegation...

-- 
Trond Myklebust

Linux NFS client maintainer, PrimaryData

trond.myklebust@primarydata.com

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: NFS Kernel Bug
  2014-09-18 14:31 ` Trond Myklebust
@ 2014-09-18 16:00   ` Trond Myklebust
  2014-09-19 17:15     ` Ken Hahn
  0 siblings, 1 reply; 4+ messages in thread
From: Trond Myklebust @ 2014-09-18 16:00 UTC (permalink / raw)
  To: James Drews; +Cc: Linux NFS Mailing List

On Thu, 2014-09-18 at 10:31 -0400, Trond Myklebust wrote:
> On Thu, Sep 18, 2014 at 9:02 AM, James Drews <drews@engr.wisc.edu> wrote:
> > Good morning!
> >
> > I believe we have found a bug in the NFS kernel area.  The "bug" is a leak
> > of a file handle where the NFS client never tells the server to close the
> > file. The problem is very similar to one we had reported and got a fix for
> > previously. We are using that patch, but ran in to another case where the
> > client sends out an OPEN_DOWNGRADE but never sends a CLOSE.
> >
> > Attached is a simple c program that we have been able to reproduce the bug
> > with, along with a packet capture of what we see on the wire.
> >
> > To reproduce the bug:
> > -compile the c code
> > -execute the c code with:
> >
> > ./test ; cat testfile3 > /dev/nul
> >
> > -now if we try to remove the file we get a file in use error (server is
> > using mandatory locking)
> >
> > Things to note:
> >
> > -if you just run the program without the immediate cat'ing of the file, the
> > bug does not happen
> > suggesting a timing issue
> > -If you alter the program so the code mimics the cat of the file, the bug
> > does not happen (ie, add an open, read file, close to the code).
> > -If you run the program as described above, and then run it again without
> > the "; cat testfile3 > /dev/nul", the kernel squeaks out the file close to
> > the server when the code does the close.
> >
> > The attached packet capture is us doing:
> >
> > ./test ; cat testfile3 > /dev/null
> > rm testfile3
> > ./test
> > rm testfile3
> >
> > where we are denied the rm the first time, but not the second.
> >
> 
> Argh. This is a situation where the client shouldn't have called
> OPEN_DOWNGRADE, but should have done a CLOSE. The issue is that the
> client opens the file with OPEN4_SHARE_ACCESS_BOTH, so it is not
> allowed to downgrade to OPEN4_SHARE_ACCESS_READ. Instead it should
> have closed the file, and then used the delegation...
> 

Does the following patch help (and does it still fix your original
bugreport)?

Cheers
  Trond
8<----------------------------------------------------------------
>From d5f56bc67514a290032ff063d837179853231acf Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust@primarydata.com>
Date: Thu, 18 Sep 2014 11:51:32 -0400
Subject: [PATCH] NFSv4: Fix another bug in the close/open_downgrade code

James Drew reports another bug whereby the NFS client is now sending
an OPEN_DOWNGRADE in a situation where it should really have sent a
CLOSE: the client is opening the file for O_RDWR, but then trying to
do a downgrade to O_RDONLY, which is not allowed by the NFSv4 spec.

Reported-by: James Drews <drews@engr.wisc.edu>
Link: http://lkml.kernel.org/r/541AD7E5.8020409@engr.wisc.edu
Fixes: aee7af356e15 (NFSv4: Fix problems with close in the presence...)
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfs/nfs4proc.c | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index ac2dd953fc18..6ca0c8e7a945 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2618,23 +2618,23 @@ static void nfs4_close_prepare(struct rpc_task *task, void *data)
 	is_rdwr = test_bit(NFS_O_RDWR_STATE, &state->flags);
 	is_rdonly = test_bit(NFS_O_RDONLY_STATE, &state->flags);
 	is_wronly = test_bit(NFS_O_WRONLY_STATE, &state->flags);
-	/* Calculate the current open share mode */
-	calldata->arg.fmode = 0;
-	if (is_rdonly || is_rdwr)
-		calldata->arg.fmode |= FMODE_READ;
-	if (is_wronly || is_rdwr)
-		calldata->arg.fmode |= FMODE_WRITE;
 	/* Calculate the change in open mode */
+	calldata->arg.fmode = 0;
 	if (state->n_rdwr == 0) {
-		if (state->n_rdonly == 0) {
-			call_close |= is_rdonly || is_rdwr;
-			calldata->arg.fmode &= ~FMODE_READ;
-		}
-		if (state->n_wronly == 0) {
-			call_close |= is_wronly || is_rdwr;
-			calldata->arg.fmode &= ~FMODE_WRITE;
-		}
-	}
+		if (state->n_rdonly == 0)
+			call_close |= is_rdonly;
+		else if (is_rdonly)
+			calldata->arg.fmode |= FMODE_READ;
+		if (state->n_wronly == 0)
+			call_close |= is_wronly;
+		else if (is_wronly)
+			calldata->arg.fmode |= FMODE_WRITE;
+	} else if (is_rdwr)
+		calldata->arg.fmode |= FMODE_READ|FMODE_WRITE;
+
+	if (calldata->arg.fmode == 0)
+		call_close |= is_rdwr;
+
 	if (!nfs4_valid_open_stateid(state))
 		call_close = 0;
 	spin_unlock(&state->owner->so_lock);
-- 
1.9.3




^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: NFS Kernel Bug
  2014-09-18 16:00   ` Trond Myklebust
@ 2014-09-19 17:15     ` Ken Hahn
  0 siblings, 0 replies; 4+ messages in thread
From: Ken Hahn @ 2014-09-19 17:15 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Linux NFS Mailing List

Hello,

I work with drews@engr.wisc.edu.

On 09/18/2014 11:00 AM, Trond Myklebust wrote:
>>
>> Argh. This is a situation where the client shouldn't have called
>> OPEN_DOWNGRADE, but should have done a CLOSE. The issue is that the
>> client opens the file with OPEN4_SHARE_ACCESS_BOTH, so it is not
>> allowed to downgrade to OPEN4_SHARE_ACCESS_READ. Instead it should
>> have closed the file, and then used the delegation...
>>
> 
> Does the following patch help (and does it still fix your original
> bugreport)?

I tried it out, patching the Debian 3.14.15 kernel.. had to alter the
patch a tiny bit, but I don't think I messed up any real functionality.

Indeed it passes the old test, and avoids the new bug too.

Thanks!

-Ken Hahn

Here's the altered patch for what it is worth (for anyone else using the
current debian backports kernel):

--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2560,20 +2560,22 @@ static void nfs4_close_prepare(struct rpc_task
*task, void *data)
        is_rdwr = test_bit(NFS_O_RDWR_STATE, &state->flags);
        is_rdonly = test_bit(NFS_O_RDONLY_STATE, &state->flags);
        is_wronly = test_bit(NFS_O_WRONLY_STATE, &state->flags);
-       if (is_rdonly || is_rdwr)
-               calldata->arg.fmode |= FMODE_READ;
-       if (is_wronly || is_rdwr)
-               calldata->arg.fmode |= FMODE_WRITE;
+       calldata->arg.fmode = 0;
        if (state->n_rdwr == 0) {
-               if (state->n_rdonly == 0) {
-                       call_close |= is_rdonly || is_rdwr;
-                       calldata->arg.fmode &= ~FMODE_READ;
-               }
-               if (state->n_wronly == 0) {
-                       call_close |= is_wronly || is_rdwr;
-                       calldata->arg.fmode &= ~FMODE_WRITE;
-               }
-       }
+               if (state->n_rdonly == 0)
+                       call_close |= is_rdonly;
+               else if (is_rdonly)
+                       calldata->arg.fmode |= FMODE_READ;
+               if (state->n_wronly == 0)
+                       call_close |= is_wronly;
+               else if (is_wronly)
+                       calldata->arg.fmode |= FMODE_WRITE;
+       } else if (is_rdwr)
+               calldata->arg.fmode |= FMODE_READ|FMODE_WRITE;
+
+       if (calldata->arg.fmode == 0)
+               call_close |= is_rdwr;
+
        if (!nfs4_valid_open_stateid(state))
                call_close = 0;
        spin_unlock(&state->owner->so_lock);
--
1.9.3




^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2014-09-19 17:20 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-18 13:02 NFS Kernel Bug James Drews
2014-09-18 14:31 ` Trond Myklebust
2014-09-18 16:00   ` Trond Myklebust
2014-09-19 17:15     ` Ken Hahn

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.