All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Benjamin Coddington" <bcodding@redhat.com>
To: "Trond Myklebust" <trondmy@primarydata.com>
Cc: "anna.schumaker@netapp.com" <anna.schumaker@netapp.com>,
	"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH 3/3] NFSv4.1: Detect and retry after OPEN and CLOSE/DOWNGRADE race
Date: Tue, 17 Oct 2017 13:33:57 -0400	[thread overview]
Message-ID: <8384C212-F32B-4F5A-B522-A8F08108DA58@redhat.com> (raw)
In-Reply-To: <1508255361.3718.17.camel@primarydata.com>

On 17 Oct 2017, at 11:49, Trond Myklebust wrote:

> On Tue, 2017-10-17 at 10:46 -0400, Benjamin Coddington wrote:
>> If the client issues two simultaneous OPEN calls, and the response to
>> the
>> first OPEN call (sequence id 1) is delayed before updating the
>> nfs4_state,
>> then the client's nfs4_state can transition through a complete
>> lifecycle of
>> OPEN (state sequence id 2), and CLOSE (state sequence id 3).  When
>> the
>> first OPEN is finally processed, the nfs4_state is incorrectly
>> transitioned
>> back to NFS_OPEN_STATE for the first OPEN (sequence id
>> 1).  Subsequent calls
>> to LOCK or CLOSE will receive NFS4ERR_BAD_STATEID, and trigger state
>> recovery.
>>
>> Fix this by passing back the result of need_update_open_stateid() to
>> the
>> open() path, with the result to try again if the OPEN's stateid
>> should not
>> be updated.
>>
>
> Why are we worried about the special case where the client actually
> finds the closed stateid in its cache?

Because I am hitting that case very frequently in generic/089, and I hate
how unnecessary state recovery slows everything down.  I'm also seeing a
problem where generic/089 never completes, and I was trying to get this
behavior out of the way.

> In the more general case of your race, the stateid might not be found
> at all because the CLOSE completes and is processed on the client
> before it can process the reply from the delayed OPEN. If so, we really
> have no way to detect that the file has actually been closed by the
> server until we see the NFS4ERR_BAD_STATEID.

I mentioned this case in the cover letter.  It's possible that the client
could retain a record of a closed stateid in order to retry an OPEN in that
case.  Another approach may be to detect 'holes' in the state id sequence
and not call CLOSE until each id is processed.  I think there's an existing
problem now where this race (without the CLOSE) can incorrectly update the
state flags with the result of the delayed OPEN's mode.

> Note also that section 18.2.4 says that "the server SHOULD return the
> invalid special stateid (the "other" value is zero and the "seqid"
> field is NFS4_UINT32_MAX, see Section 8.2.3)" which further ensures
> that we should not be able to match the OPEN stateid once the CLOSE is
> processed on the client.

Ah, right, so need_update_open_stateid() should handle that case.  But that
doesn't mean there's no way to match the OPEN statid after a CLOSE.  We
could still keep a record of the closed state with a flag instead of an
incremented sequence id.

So perhaps keeping track of holes in the sequence would be a better
approach, but then the OPEN response handling needs to call CLOSE in case
the OPEN fails.. that seems more complicated.

Ben

  reply	other threads:[~2017-10-17 17:33 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-17 14:46 [PATCH 0/3] NFSv4.1: OPEN and CLOSE/DOWNGRADE race Benjamin Coddington
2017-10-17 14:46 ` [PATCH 1/3] NFSv4: Move __update_open_stateid() into update_open_stateid() Benjamin Coddington
2017-10-17 14:46 ` [PATCH 2/3] NFSv4: Move nfs_set_open_stateid_locked " Benjamin Coddington
2017-10-17 14:46 ` [PATCH 3/3] NFSv4.1: Detect and retry after OPEN and CLOSE/DOWNGRADE race Benjamin Coddington
2017-10-17 15:49   ` Trond Myklebust
2017-10-17 17:33     ` Benjamin Coddington [this message]
2017-10-17 17:42       ` Trond Myklebust
2017-10-17 17:52         ` Benjamin Coddington
2017-10-17 18:26           ` Trond Myklebust
2017-10-17 20:29             ` Benjamin Coddington

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8384C212-F32B-4F5A-B522-A8F08108DA58@redhat.com \
    --to=bcodding@redhat.com \
    --cc=anna.schumaker@netapp.com \
    --cc=linux-nfs@vger.kernel.org \
    --cc=trondmy@primarydata.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.