* NFS Kernel Bug @ 2014-09-18 13:02 James Drews 2014-09-18 14:31 ` Trond Myklebust 0 siblings, 1 reply; 4+ messages in thread From: James Drews @ 2014-09-18 13:02 UTC (permalink / raw) To: linux-nfs [-- Attachment #1: Type: text/plain, Size: 1383 bytes --] Good morning! I believe we have found a bug in the NFS kernel area. The "bug" is a leak of a file handle where the NFS client never tells the server to close the file. The problem is very similar to one we had reported and got a fix for previously. We are using that patch, but ran in to another case where the client sends out an OPEN_DOWNGRADE but never sends a CLOSE. Attached is a simple c program that we have been able to reproduce the bug with, along with a packet capture of what we see on the wire. To reproduce the bug: -compile the c code -execute the c code with: ./test ; cat testfile3 > /dev/nul -now if we try to remove the file we get a file in use error (server is using mandatory locking) Things to note: -if you just run the program without the immediate cat'ing of the file, the bug does not happen suggesting a timing issue -If you alter the program so the code mimics the cat of the file, the bug does not happen (ie, add an open, read file, close to the code). -If you run the program as described above, and then run it again without the "; cat testfile3 > /dev/nul", the kernel squeaks out the file close to the server when the code does the close. The attached packet capture is us doing: ./test ; cat testfile3 > /dev/null rm testfile3 ./test rm testfile3 where we are denied the rm the first time, but not the second. Thanks James [-- Attachment #2: test.c --] [-- Type: application/softgrid-c, Size: 450 bytes --] [-- Attachment #3: nfsbug.zip --] [-- Type: application/x-zip-compressed, Size: 6106 bytes --] ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: NFS Kernel Bug 2014-09-18 13:02 NFS Kernel Bug James Drews @ 2014-09-18 14:31 ` Trond Myklebust 2014-09-18 16:00 ` Trond Myklebust 0 siblings, 1 reply; 4+ messages in thread From: Trond Myklebust @ 2014-09-18 14:31 UTC (permalink / raw) To: James Drews; +Cc: Linux NFS Mailing List On Thu, Sep 18, 2014 at 9:02 AM, James Drews <drews@engr.wisc.edu> wrote: > Good morning! > > I believe we have found a bug in the NFS kernel area. The "bug" is a leak > of a file handle where the NFS client never tells the server to close the > file. The problem is very similar to one we had reported and got a fix for > previously. We are using that patch, but ran in to another case where the > client sends out an OPEN_DOWNGRADE but never sends a CLOSE. > > Attached is a simple c program that we have been able to reproduce the bug > with, along with a packet capture of what we see on the wire. > > To reproduce the bug: > -compile the c code > -execute the c code with: > > ./test ; cat testfile3 > /dev/nul > > -now if we try to remove the file we get a file in use error (server is > using mandatory locking) > > Things to note: > > -if you just run the program without the immediate cat'ing of the file, the > bug does not happen > suggesting a timing issue > -If you alter the program so the code mimics the cat of the file, the bug > does not happen (ie, add an open, read file, close to the code). > -If you run the program as described above, and then run it again without > the "; cat testfile3 > /dev/nul", the kernel squeaks out the file close to > the server when the code does the close. > > The attached packet capture is us doing: > > ./test ; cat testfile3 > /dev/null > rm testfile3 > ./test > rm testfile3 > > where we are denied the rm the first time, but not the second. > Argh. This is a situation where the client shouldn't have called OPEN_DOWNGRADE, but should have done a CLOSE. The issue is that the client opens the file with OPEN4_SHARE_ACCESS_BOTH, so it is not allowed to downgrade to OPEN4_SHARE_ACCESS_READ. Instead it should have closed the file, and then used the delegation... -- Trond Myklebust Linux NFS client maintainer, PrimaryData trond.myklebust@primarydata.com ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: NFS Kernel Bug 2014-09-18 14:31 ` Trond Myklebust @ 2014-09-18 16:00 ` Trond Myklebust 2014-09-19 17:15 ` Ken Hahn 0 siblings, 1 reply; 4+ messages in thread From: Trond Myklebust @ 2014-09-18 16:00 UTC (permalink / raw) To: James Drews; +Cc: Linux NFS Mailing List On Thu, 2014-09-18 at 10:31 -0400, Trond Myklebust wrote: > On Thu, Sep 18, 2014 at 9:02 AM, James Drews <drews@engr.wisc.edu> wrote: > > Good morning! > > > > I believe we have found a bug in the NFS kernel area. The "bug" is a leak > > of a file handle where the NFS client never tells the server to close the > > file. The problem is very similar to one we had reported and got a fix for > > previously. We are using that patch, but ran in to another case where the > > client sends out an OPEN_DOWNGRADE but never sends a CLOSE. > > > > Attached is a simple c program that we have been able to reproduce the bug > > with, along with a packet capture of what we see on the wire. > > > > To reproduce the bug: > > -compile the c code > > -execute the c code with: > > > > ./test ; cat testfile3 > /dev/nul > > > > -now if we try to remove the file we get a file in use error (server is > > using mandatory locking) > > > > Things to note: > > > > -if you just run the program without the immediate cat'ing of the file, the > > bug does not happen > > suggesting a timing issue > > -If you alter the program so the code mimics the cat of the file, the bug > > does not happen (ie, add an open, read file, close to the code). > > -If you run the program as described above, and then run it again without > > the "; cat testfile3 > /dev/nul", the kernel squeaks out the file close to > > the server when the code does the close. > > > > The attached packet capture is us doing: > > > > ./test ; cat testfile3 > /dev/null > > rm testfile3 > > ./test > > rm testfile3 > > > > where we are denied the rm the first time, but not the second. > > > > Argh. This is a situation where the client shouldn't have called > OPEN_DOWNGRADE, but should have done a CLOSE. The issue is that the > client opens the file with OPEN4_SHARE_ACCESS_BOTH, so it is not > allowed to downgrade to OPEN4_SHARE_ACCESS_READ. Instead it should > have closed the file, and then used the delegation... > Does the following patch help (and does it still fix your original bugreport)? Cheers Trond 8<---------------------------------------------------------------- >From d5f56bc67514a290032ff063d837179853231acf Mon Sep 17 00:00:00 2001 From: Trond Myklebust <trond.myklebust@primarydata.com> Date: Thu, 18 Sep 2014 11:51:32 -0400 Subject: [PATCH] NFSv4: Fix another bug in the close/open_downgrade code James Drew reports another bug whereby the NFS client is now sending an OPEN_DOWNGRADE in a situation where it should really have sent a CLOSE: the client is opening the file for O_RDWR, but then trying to do a downgrade to O_RDONLY, which is not allowed by the NFSv4 spec. Reported-by: James Drews <drews@engr.wisc.edu> Link: http://lkml.kernel.org/r/541AD7E5.8020409@engr.wisc.edu Fixes: aee7af356e15 (NFSv4: Fix problems with close in the presence...) Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> --- fs/nfs/nfs4proc.c | 30 +++++++++++++++--------------- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index ac2dd953fc18..6ca0c8e7a945 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -2618,23 +2618,23 @@ static void nfs4_close_prepare(struct rpc_task *task, void *data) is_rdwr = test_bit(NFS_O_RDWR_STATE, &state->flags); is_rdonly = test_bit(NFS_O_RDONLY_STATE, &state->flags); is_wronly = test_bit(NFS_O_WRONLY_STATE, &state->flags); - /* Calculate the current open share mode */ - calldata->arg.fmode = 0; - if (is_rdonly || is_rdwr) - calldata->arg.fmode |= FMODE_READ; - if (is_wronly || is_rdwr) - calldata->arg.fmode |= FMODE_WRITE; /* Calculate the change in open mode */ + calldata->arg.fmode = 0; if (state->n_rdwr == 0) { - if (state->n_rdonly == 0) { - call_close |= is_rdonly || is_rdwr; - calldata->arg.fmode &= ~FMODE_READ; - } - if (state->n_wronly == 0) { - call_close |= is_wronly || is_rdwr; - calldata->arg.fmode &= ~FMODE_WRITE; - } - } + if (state->n_rdonly == 0) + call_close |= is_rdonly; + else if (is_rdonly) + calldata->arg.fmode |= FMODE_READ; + if (state->n_wronly == 0) + call_close |= is_wronly; + else if (is_wronly) + calldata->arg.fmode |= FMODE_WRITE; + } else if (is_rdwr) + calldata->arg.fmode |= FMODE_READ|FMODE_WRITE; + + if (calldata->arg.fmode == 0) + call_close |= is_rdwr; + if (!nfs4_valid_open_stateid(state)) call_close = 0; spin_unlock(&state->owner->so_lock); -- 1.9.3 ^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: NFS Kernel Bug 2014-09-18 16:00 ` Trond Myklebust @ 2014-09-19 17:15 ` Ken Hahn 0 siblings, 0 replies; 4+ messages in thread From: Ken Hahn @ 2014-09-19 17:15 UTC (permalink / raw) To: Trond Myklebust; +Cc: Linux NFS Mailing List Hello, I work with drews@engr.wisc.edu. On 09/18/2014 11:00 AM, Trond Myklebust wrote: >> >> Argh. This is a situation where the client shouldn't have called >> OPEN_DOWNGRADE, but should have done a CLOSE. The issue is that the >> client opens the file with OPEN4_SHARE_ACCESS_BOTH, so it is not >> allowed to downgrade to OPEN4_SHARE_ACCESS_READ. Instead it should >> have closed the file, and then used the delegation... >> > > Does the following patch help (and does it still fix your original > bugreport)? I tried it out, patching the Debian 3.14.15 kernel.. had to alter the patch a tiny bit, but I don't think I messed up any real functionality. Indeed it passes the old test, and avoids the new bug too. Thanks! -Ken Hahn Here's the altered patch for what it is worth (for anyone else using the current debian backports kernel): --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -2560,20 +2560,22 @@ static void nfs4_close_prepare(struct rpc_task *task, void *data) is_rdwr = test_bit(NFS_O_RDWR_STATE, &state->flags); is_rdonly = test_bit(NFS_O_RDONLY_STATE, &state->flags); is_wronly = test_bit(NFS_O_WRONLY_STATE, &state->flags); - if (is_rdonly || is_rdwr) - calldata->arg.fmode |= FMODE_READ; - if (is_wronly || is_rdwr) - calldata->arg.fmode |= FMODE_WRITE; + calldata->arg.fmode = 0; if (state->n_rdwr == 0) { - if (state->n_rdonly == 0) { - call_close |= is_rdonly || is_rdwr; - calldata->arg.fmode &= ~FMODE_READ; - } - if (state->n_wronly == 0) { - call_close |= is_wronly || is_rdwr; - calldata->arg.fmode &= ~FMODE_WRITE; - } - } + if (state->n_rdonly == 0) + call_close |= is_rdonly; + else if (is_rdonly) + calldata->arg.fmode |= FMODE_READ; + if (state->n_wronly == 0) + call_close |= is_wronly; + else if (is_wronly) + calldata->arg.fmode |= FMODE_WRITE; + } else if (is_rdwr) + calldata->arg.fmode |= FMODE_READ|FMODE_WRITE; + + if (calldata->arg.fmode == 0) + call_close |= is_rdwr; + if (!nfs4_valid_open_stateid(state)) call_close = 0; spin_unlock(&state->owner->so_lock); -- 1.9.3 ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2014-09-19 17:20 UTC | newest] Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2014-09-18 13:02 NFS Kernel Bug James Drews 2014-09-18 14:31 ` Trond Myklebust 2014-09-18 16:00 ` Trond Myklebust 2014-09-19 17:15 ` Ken Hahn
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.