* NFSv4.1: LOCK races with returning a delegation
@ 2016-03-11 18:21 Chuck Lever
  2016-03-22 21:47 ` Chuck Lever
  0 siblings, 1 reply; 2+ messages in thread
From: Chuck Lever @ 2016-03-11 18:21 UTC (permalink / raw)
  To: Trond Myklebust, Anna Schumaker@netapp com; +Cc: Linux NFS Mailing List

Hi-

We observed some behavior at Connectathon between a
v4.5-rc6 Linux client and a prototype Solaris 12 server
when using NFSv4.1 on TCP. Test is xfstests generic/089.

At an earlier point, the client has been granted a
write delegation, denoted below by "state=w", and has
closed the file.

The sequence then observed on the wire is:

  C OPEN fh=A claim=DELEG_CUR_FH
  C LOCK fh=A zero stateid
  R OPEN NFS4_OK state=a
  R LOCK NFS4ERR_BAD_STATEID
  C TEST_STATEID state=a
  R TEST_STATEID NFS4_OK

    client reports "Lock reclaim failed!"

  C LOCK fh=A state=a
  R LOCK NFS4_OK
  C DELEGRETURN state=w
  R DELEGRETURN NFS4_OK
  C RENAME -> .nfsXXXXXXXXXX
  R RENAME NFS4_OK

I've reproduced the problem here at home. Sometimes
the LOCK operation is emitted just _before_ the
OPEN.

There are two processes involved. One is attempting
to lock the file. The other is attempting to unlink
the same file while it is still open.

I'm not sure why the client is emitting the LOCK
operation at all, since it still holds a write
delegation. The "Lock reclaim failed!" message
seems to reflect this confusion: It expects to find
and recover lock state, but there hasn't been a
successful LOCK on that file yet.

After browsing the code, I don't see any serialization
between taking a lock and returning a delegation on
the same file, but my understanding in this area comes
up short.

Is there a preferred way to serialize these two
activities (like, a particular mutex that should be
held) ?

Thanks for any guidance!

--
Chuck Lever





* Re: NFSv4.1: LOCK races with returning a delegation
  2016-03-11 18:21 NFSv4.1: LOCK races with returning a delegation Chuck Lever
@ 2016-03-22 21:47 ` Chuck Lever
  0 siblings, 0 replies; 2+ messages in thread
From: Chuck Lever @ 2016-03-22 21:47 UTC (permalink / raw)
  To: Trond Myklebust, Anna Schumaker@netapp com; +Cc: Linux NFS Mailing List


> On Mar 11, 2016, at 1:21 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
> 
> Hi-
> 
> We observed some behavior at Connectathon between a
> v4.5-rc6 Linux client and a prototype Solaris 12 server
> when using NFSv4.1 on TCP. Test is xfstests generic/089.
> 
> At an earlier point, the client has been granted a
> write delegation, denoted below by "state=w", and has
> closed the file.
> 
> The sequence then observed on the wire is:
> 
>  C OPEN fh=A claim=DELEG_CUR_FH
>  C LOCK fh=A zero stateid
>  R OPEN NFS4_OK state=a
>  R LOCK NFS4ERR_BAD_STATEID
>  C TEST_STATEID state=a
>  R TEST_STATEID NFS4_OK
> 
>    client reports "Lock reclaim failed!"
> 
>  C LOCK fh=A state=a
>  R LOCK NFS4_OK
>  C DELEGRETURN state=w
>  R DELEGRETURN NFS4_OK
>  C RENAME -> .nfsXXXXXXXXXX
>  R RENAME NFS4_OK
> 
> I've reproduced the problem here at home. Sometimes
> the LOCK operation is emitted just _before_ the
> OPEN.
> 
> There are two processes involved. One is attempting
> to lock the file. The other is attempting to unlink
> the same file while it is still open.
> 
> I'm not sure why the client is emitting the LOCK
> operation at all, since it still holds a write
> delegation. The "Lock reclaim failed!" message
> seems to reflect this confusion: It expects to find
> and recover lock state, but there hasn't been a
> successful LOCK on that file yet.
> 
> After browsing the code, I don't see any serialization
> between taking a lock and returning a delegation on
> the same file, but my understanding in this area comes
> up short.
> 
> Is there a preferred way to serialize these two
> activities (like, a particular mutex that should be
> held) ?
> 
> Thanks for any guidance!

Following up.

Commit 24311f884 ('NFSv4: Recovery of recalled read
delegations is broken') introduced a

  clear_bit(NFS_DELEGATED_STATE, &state->flags);

in nfs_open_delegation_recall().

That flag is cleared before the OPEN is emitted, leaving
a window where NFS_DELEGATED_STATE is clear, but there is
no valid open stateid with which to perform a LOCK.

Moving the clear_bit() into the NFS4_OK case in
nfs4_handle_delegation_recall_error() seems to eliminate
the race.
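
To illustrate the window, here is a minimal user-space
sketch. It is a toy model, not the kernel code: the
names toy_state, toy_lock_path() and toy_recall() are
hypothetical, and it assumes that a locker seeing the
delegated bit set caches the lock locally, while one
seeing neither a delegation nor an open stateid falls
back to the zero stateid seen on the wire above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the per-file state flags (hypothetical names). */
enum {
	TOY_DELEGATED_STATE = 1u << 0,	/* models NFS_DELEGATED_STATE */
	TOY_OPEN_STATE      = 1u << 1,	/* a valid open stateid exists */
};

struct toy_state {
	unsigned int flags;
};

/* Which path a locker would take if it ran at this instant. */
enum toy_lock_path {
	TOY_LOCK_LOCAL,			/* delegation held, nothing on the wire */
	TOY_LOCK_OPEN_STATEID,		/* LOCK sent with a valid open stateid */
	TOY_LOCK_ZERO_STATEID,		/* LOCK sent with zero stateid: BAD_STATEID */
};

enum toy_lock_path toy_lock_path(const struct toy_state *s)
{
	if (s->flags & TOY_DELEGATED_STATE)
		return TOY_LOCK_LOCAL;
	if (s->flags & TOY_OPEN_STATE)
		return TOY_LOCK_OPEN_STATEID;
	return TOY_LOCK_ZERO_STATEID;
}

/*
 * Model a delegation recall. The return value is what a concurrent
 * locker would observe while the OPEN is still in flight.
 *
 * clear_early == true:  flag cleared before the OPEN is sent.
 * clear_early == false: flag cleared only after OPEN returns NFS4_OK.
 */
enum toy_lock_path toy_recall(struct toy_state *s, bool clear_early)
{
	enum toy_lock_path seen;

	assert(s != NULL);

	if (clear_early)
		s->flags &= ~TOY_DELEGATED_STATE;

	/* The racing locker samples the flags here, before OPEN completes. */
	seen = toy_lock_path(s);

	/* OPEN with claim=DELEG_CUR_FH returns NFS4_OK. */
	s->flags |= TOY_OPEN_STATE;

	if (!clear_early)
		s->flags &= ~TOY_DELEGATED_STATE;

	return seen;
}
```

Starting from flags = TOY_DELEGATED_STATE, the early
clear lets the locker observe TOY_LOCK_ZERO_STATEID,
while the late clear leaves it on TOY_LOCK_LOCAL until
the open stateid exists. The real code paths are
concurrent, so this only models the ordering, not the
locking.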


--
Chuck Lever



