From: "Benjamin Coddington" <bcodding@redhat.com>
To: "Trond Myklebust" <trondmy@primarydata.com>
Cc: "jlayton@redhat.com" <jlayton@redhat.com>,
	"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
Subject: Re: CLOSE/OPEN race
Date: Mon, 14 Nov 2016 09:53:28 -0500
Message-ID: <8153C306-D6B6-442B-B565-BC6BAEE9993F@redhat.com>
In-Reply-To: <1479048471.3146.10.camel@primarydata.com>


On 13 Nov 2016, at 9:47, Trond Myklebust wrote:

> On Sat, 2016-11-12 at 21:56 -0500, Jeff Layton wrote:
>> On Sat, 2016-11-12 at 16:16 -0500, Jeff Layton wrote:
>>>
>>> On Sat, 2016-11-12 at 13:03 -0500, Benjamin Coddington wrote:
>>>>
>>>> On 12 Nov 2016, at 11:52, Jeff Layton wrote:
>>>>
>>>>>
>>>>> On Sat, 2016-11-12 at 10:31 -0500, Benjamin Coddington wrote:
>>>>>>
>>>>>> On 12 Nov 2016, at 7:54, Jeff Layton wrote:
>>>>>>
>>>>>>>
>>>>>>> On Sat, 2016-11-12 at 06:08 -0500, Benjamin Coddington wrote:
>>>>>>>>
>>>>>>>> I've been seeing the following on a modified version of
>>>>>>>> generic/089 that gets the client stuck sending LOCK with
>>>>>>>> NFS4ERR_OLD_STATEID.
>>>>>>>>
>>>>>>>> 1. Client has open stateid A, sends a CLOSE
>>>>>>>> 2. Client sends OPEN with same owner
>>>>>>>> 3. Client sends another OPEN with same owner
>>>>>>>> 4. Client gets a reply to OPEN in 3, stateid is B.2 (stateid B, sequence 2)
>>>>>>>> 5. Client does LOCK, LOCKU, FREE_STATEID from B.2
>>>>>>>> 6. Client gets a reply to CLOSE in 1
>>>>>>>> 7. Client gets a reply to OPEN in 2, stateid is B.1
>>>>>>>> 8. Client sends LOCK with B.1 - OLD_STATEID, now stuck in a loop
>>>>>>>>
>>>>>>>> The CLOSE response in 6 causes us to clear NFS_OPEN_STATE, so that
>>>>>>>> the OPEN response in 7 is able to update the open_stateid even
>>>>>>>> though it has a lower sequence number.
>>>>>>>>
>>>>>>>> I think this case could be handled by never updating the
>>>>>>>> open_stateid if the stateids match but the sequence number of the
>>>>>>>> new stateid is less than the current open_stateid.
>>>>>>>>
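
For reference, the newer-vs-same decision above comes down to two small
helpers; this is a sketch modeled on the client's stateid helpers
(written from memory of nfs4_fs.h, so double-check against the tree):

static bool nfs4_stateid_match_other(const nfs4_stateid *s1,
		const nfs4_stateid *s2)
{
	/* Same "other" field means the same open; the seqid may differ. */
	return memcmp(s1->other, s2->other, NFS4_STATEID_OTHER_SIZE) == 0;
}

static bool nfs4_stateid_is_newer(const nfs4_stateid *s1,
		const nfs4_stateid *s2)
{
	/* Signed difference so the comparison survives seqid wraparound. */
	return (s32)(be32_to_cpu(s1->seqid) - be32_to_cpu(s2->seqid)) > 0;
}

With those, B.1 arriving after B.2 fails is_newer() and should be
rejected, unless NFS_OPEN_STATE was cleared and the comparison never
runs, which is exactly the window above.
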
>>>>>>>
>>>>>>> What kernel is this on?
>>>>>>
>>>>>> On v4.9-rc2 with a couple fixups.  Without them, I can't test
>>>>>> long enough to reproduce this race.  I don't think any of those
>>>>>> are involved in this problem, though.
>>>>>>
>>>>>>>
>>>>>>> Yes, that seems wrong. The client should be picking B.2 for the
>>>>>>> open stateid to use. I think that decision of whether to take a
>>>>>>> seqid is made in nfs_need_update_open_stateid. The logic in there
>>>>>>> looks correct to me at first glance though.
>>>>>>
>>>>>> nfs_need_update_open_stateid() will return true if NFS_OPEN_STATE
>>>>>> is unset.  That's the precondition set up by steps 1-6.  Perhaps
>>>>>> it should not update the stateid if they match but the sequence
>>>>>> number is less, and still set NFS_OPEN_STATE once more.  That will
>>>>>> fix _this_ case.  Are there other cases where that would be a
>>>>>> problem?
>>>>>>
>>>>>> Ben
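
For context, the unpatched check as I read it in v4.9 (paraphrased from
memory, so treat this as a sketch rather than the exact source):

static bool nfs_need_update_open_stateid(struct nfs4_state *state,
		const nfs4_stateid *stateid, nfs4_stateid *freeme)
{
	/* NFS_OPEN_STATE clear: take the new stateid unconditionally */
	if (test_and_set_bit(NFS_OPEN_STATE, &state->flags) == 0)
		return true;
	/* Different "other" field: a replacement open, free the old one */
	if (!nfs4_stateid_match_other(stateid, &state->open_stateid)) {
		nfs4_stateid_copy(freeme, &state->open_stateid);
		nfs_test_and_clear_all_open_stateid(state);
		return true;
	}
	/* Same open: only accept a seqid that moved forward */
	if (nfs4_stateid_is_newer(stateid, &state->open_stateid))
		return true;
	return false;
}

The first branch is the one that lets B.1 through after the CLOSE reply
clears NFS_OPEN_STATE.
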
>>>>>
>>>>> That seems wrong.
>>>>
>>>> I'm not sure what you mean: what seems wrong?
>>>>
>>>
>>> Sorry, it seems wrong that the client would issue the LOCK with
>>> B.1 there.
>>>
>>>>>
>>>>> The only close was sent in step 1, and that was for a completely
>>>>> different stateid (A rather than B). It seems likely that that is
>>>>> where the bug is.
>>>>
>>>> I'm still not sure what point you're trying to make..
>>>>
>>>> Even though the close was sent in step 1, the response wasn't
>>>> processed until step 6..
>>>
>>> Not really a point per-se, I was just saying where I think the bug
>>> might be...
>>>
>>> When you issue a CLOSE, you issue it vs. a particular stateid
>>> (stateid "A" in this case). Once the open stateid has been
>>> superseded by "B", the closing of "A" should have no effect.
>>>
>>> Perhaps nfs_clear_open_stateid needs to check and see whether the
>>> open stateid has been superseded before doing its thing?
>>>
>>
>> Ok, I see something that might be a problem in this call in
>> nfs4_close_done:
>>
>>        nfs_clear_open_stateid(state, &calldata->arg.stateid,
>>                        res_stateid, calldata->arg.fmode);
>>
>> Note that we pass two nfs4_stateids to this call. The first is the
>> stateid that got sent in the CLOSE call, and the second is the stateid
>> that came back in the CLOSE response.
>>
>> RFC5661 and RFC7530 both indicate that the stateid in a CLOSE response
>> should be ignored.
>>
>> So, I think a patch like this may be in order. As to whether it will
>> fix this bug, I sort of doubt it, but it might not hurt to test it
>> out?
>>
>> ----------------------8<--------------------------
>>
>> [RFC PATCH] nfs: properly ignore the stateid in a CLOSE response
>>
>> Signed-off-by: Jeff Layton 
>> ---
>>  fs/nfs/nfs4proc.c | 14 +++-----------
>>  1 file changed, 3 insertions(+), 11 deletions(-)
>>
>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>> index 7897826d7c51..58413bd0aae2 100644
>> --- a/fs/nfs/nfs4proc.c
>> +++ b/fs/nfs/nfs4proc.c
>> @@ -1451,7 +1451,6 @@ static void nfs_resync_open_stateid_locked(struct nfs4_state *state)
>>  }
>>  
>>  static void nfs_clear_open_stateid_locked(struct nfs4_state *state,
>> -		nfs4_stateid *arg_stateid,
>>  		nfs4_stateid *stateid, fmode_t fmode)
>>  {
>>  	clear_bit(NFS_O_RDWR_STATE, &state->flags);
>> @@ -1467,12 +1466,8 @@ static void nfs_clear_open_stateid_locked(struct nfs4_state *state,
>>  		clear_bit(NFS_O_WRONLY_STATE, &state->flags);
>>  		clear_bit(NFS_OPEN_STATE, &state->flags);
>>  	}
>> -	if (stateid == NULL)
>> -		return;
>>  	/* Handle races with OPEN */
>> -	if (!nfs4_stateid_match_other(arg_stateid, &state->open_stateid) ||
>> -	    (nfs4_stateid_match_other(stateid, &state->open_stateid) &&
>> -	    !nfs4_stateid_is_newer(stateid, &state->open_stateid))) {
>> +	if (!nfs4_stateid_match_other(stateid, &state->open_stateid)) {
>
> No. I think what we should be doing here is
>
> 1) if (nfs4_stateid_match_other(arg_stateid, &state->open_stateid) then

You must mean (!nfs4_stateid_match_other(arg_stateid, &state->open_stateid))

> just ignore the result and return immediately, because it applies to a
> completely different stateid.
>
> 2) if (nfs4_stateid_match_other(stateid, &state->open_stateid)
> && !nfs4_stateid_is_newer(stateid, &state->open_stateid)) then resync,
> because this was likely an OPEN_DOWNGRADE that has raced with one or
> more OPEN calls.
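
Taken together, with the '!' correction above, I read that as something
like the following at the top of nfs_clear_open_stateid_locked() (my
sketch of the two rules, not tested):

	/* 1) CLOSE was against an already-superseded stateid: ignore it */
	if (!nfs4_stateid_match_other(arg_stateid, &state->open_stateid))
		return;
	/* 2) Same stateid but not newer: likely an OPEN_DOWNGRADE that
	 *    raced with one or more OPENs, so resync from the counters */
	if (nfs4_stateid_match_other(stateid, &state->open_stateid) &&
	    !nfs4_stateid_is_newer(stateid, &state->open_stateid)) {
		nfs_resync_open_stateid_locked(state);
		return;
	}
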

OK, but these do not help the originally reported race because at the
time that the CLOSE response handling does a resync - I presume
nfs_resync_open_stateid_locked() - all the state counters are zero,
which bypasses resetting NFS_OPEN_STATE, which, if unset, allows any
stateid to update the open_stateid.
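
That is, nfs_resync_open_stateid_locked() looks roughly like this
(again paraphrased from my reading of the tree, so verify):

static void nfs_resync_open_stateid_locked(struct nfs4_state *state)
{
	/* No opens outstanding: nothing to resync, and NFS_OPEN_STATE
	 * stays clear, which is the problem in this race */
	if (!(state->n_wronly || state->n_rdonly || state->n_rdwr))
		return;
	if (state->n_wronly)
		set_bit(NFS_O_WRONLY_STATE, &state->flags);
	if (state->n_rdonly)
		set_bit(NFS_O_RDONLY_STATE, &state->flags);
	if (state->n_rdwr)
		set_bit(NFS_O_RDWR_STATE, &state->flags);
	set_bit(NFS_OPEN_STATE, &state->flags);
}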

I need something like:

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index a230764b7d07..c9aa166c45aa 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -1425,8 +1425,13 @@ static void nfs_test_and_clear_all_open_stateid(struct nfs4_state *state)
 static bool nfs_need_update_open_stateid(struct nfs4_state *state,
                const nfs4_stateid *stateid, nfs4_stateid *freeme)
 {
-       if (test_and_set_bit(NFS_OPEN_STATE, &state->flags) == 0)
+       if (test_and_set_bit(NFS_OPEN_STATE, &state->flags) == 0) {
+               if (nfs4_stateid_match_other(stateid, &state->open_stateid) &&
+                       !nfs4_stateid_is_newer(stateid, &state->open_stateid))
+                       return false;
                return true;
+       }
+
        if (!nfs4_stateid_match_other(stateid, &state->open_stateid)) {
                nfs4_stateid_copy(freeme, &state->open_stateid);
                nfs_test_and_clear_all_open_stateid(state);

I've got that in testing, and I'll let it run for a day or so.

Another way to simplify CLOSE/OPEN races might be to never reuse an
openowner if we send a close.  Any opens after a close would have to
use a new mutated openowner, so the state always proceeds from open to
close and never back again.  Any reason not to do that?
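
Roughly what I have in mind, with entirely made-up names (a counter on
the state owner that bumps whenever we send a CLOSE, so later OPENs go
out under a fresh on-the-wire owner string):

	/* hypothetical new field in struct nfs4_state_owner */
	u32	so_owner_generation;

static void nfs4_mutate_open_owner(struct nfs4_state_owner *sp)
{
	/* Called when a CLOSE is sent.  Subsequent OPENs would encode
	 * the new generation into the owner sent to the server, so it
	 * sees a brand-new openowner and state only ever moves from
	 * open to close. */
	sp->so_owner_generation++;
}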

Ben
