All of lore.kernel.org
 help / color / mirror / Atom feed
From: Olga Kornievskaia <aglo@umich.edu>
To: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: linux-nfs <linux-nfs@vger.kernel.org>
Subject: Re: is receiving BAD_STATEID during IO on delegated stateid unrecoverable?
Date: Thu, 16 Apr 2015 14:07:15 -0400	[thread overview]
Message-ID: <CAN-5tyGSEURn62cawEbudhQ5gsEdRD+HB7qKjYRS-3L-OLsPrQ@mail.gmail.com> (raw)
In-Reply-To: <CAN-5tyEx_Pp0q58EiYSLLAZZ1cR4Tn9=gkD7b_pubsZdXymsFg@mail.gmail.com>

Which of the two solutions would you prefer as fix?

Problem statement: SETATTR is sent with a delegation stateid after the
original open was closed so we have no open state. When the server
fails the SETATTR with stateid-type of error BAD_STATEID or
ADMIN_REVOKED, client fails to recover because it has no open state to
initiate recovery and instead fails with EIO.

Solution #1: When we need to send a SETATTR and we have no state, use
zero stateid instead. Disadvantage of this approach is that if the
delegation state is actually valid, then it'll force a delegation to
be recalled by the server.
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index ad7cf7e..4dda0f1 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2485,7 +2485,8 @@ static int _nfs4_do_setattr(struct inode *inode, struct rp
        truncate = (sattr->ia_valid & ATTR_SIZE) ? true : false;
        fmode = truncate ? FMODE_WRITE : FMODE_READ;

-       if (nfs4_copy_delegation_stateid(&arg.stateid, inode, fmode)) {
+       if (state != NULL &&
+               nfs4_copy_delegation_stateid(&arg.stateid, inode, fmode)) {
                /* Use that stateid */
        } else if (truncate && state != NULL) {

Solution #2: If we get a stateid-like error on a SETATTR and we have
no state, return/remove bad delegation and retry the call again which
will have the code pick the zero stateid.

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index ad7cf7e..be16143 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2535,6 +2535,14 @@ static int nfs4_do_setattr(struct inode *inode, struct rp
                                        err = -EACCES;
                                goto out;
                        }
+               case -NFS4ERR_DELEG_REVOKED:
+               case -NFS4ERR_ADMIN_REVOKED:
+               case -NFS4ERR_BAD_STATEID:
+                       if (state == NULL) {
+                               nfs4_inode_return_delegation(inode);
+                               exception.retry = 1;
+                               continue;
+                       }
                }
                err = nfs4_handle_exception(server, err, &exception);
        } while (exception.retry);

On Wed, Apr 15, 2015 at 2:27 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
> Hi Trond,
>
> I'm resurrecting an old client received "BAD_STATEID" using delegation
> stateid on some operation thread.... If client used a delegation
> stateid on a SETATTR (i.e. for open truncate) and received this error,
> does this also lead to unrecoverable state or do you think client
> should try recover? I can see the same argument that if state was
> revoked another client could have change the file already.
>
> If you think it's recoverable, there is a bug in the client because it
> doesn't recover. I can explain why but there is no need if this is an
> acceptable behavior.
>
> Thanks.
>
> On Thu, Nov 20, 2014 at 4:14 PM, Trond Myklebust
> <trond.myklebust@primarydata.com> wrote:
>> On Thu, Nov 20, 2014 at 12:57 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>> Hi folks,
>>>
>>> As far as I can tell, receiving a BAD_STATEID on an IO operation on a
>>> delegated stateid when there was a local lock acquired for this IO is
>>> unrecoverable — leads to EIO. Codewise, stateid recovery sees that it
>>> has a local lock and marks it lost and retry of the IO operation
>>> returns the EIO.
>>>
>>> Is the reason for seizing the IO is that if the server for some reason
>>> revoked this stateid then there is no guarantee about the locks and
>>> thus doing any IO.
>>>
>>> This also applies to both 4.0 and 4.1 code as far as I can tell.
>>>
>>> Can somebody confirm or tell me if this is wrong?
>>>
>>
>> Yes. If the server has lost the lock, then the application has lost
>> atomicity for the set of operations that were supposed to be protected
>> by that lock, and this is why we return the EIO. In older kernels we
>> did try to recover the lock, but that can lead to application-visible
>> corruption of data, and so we no longer do that unless you explicitly
>> set the nfs 'recover_lost_locks' module parameter - see
>> Documentation/kernel-parameters.txt.
>>
>> --
>> Trond Myklebust
>>
>> Linux NFS client maintainer, PrimaryData
>>
>> trond.myklebust@primarydata.com

  reply	other threads:[~2015-04-16 18:07 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-11-20 20:57 is receiving BAD_STATEID during IO on delegated stateid unrecoverable? Olga Kornievskaia
2014-11-20 21:14 ` Trond Myklebust
2014-11-20 21:23   ` Olga Kornievskaia
2015-04-15 18:27   ` Olga Kornievskaia
2015-04-16 18:07     ` Olga Kornievskaia [this message]
2015-04-21 13:29       ` Olga Kornievskaia

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAN-5tyGSEURn62cawEbudhQ5gsEdRD+HB7qKjYRS-3L-OLsPrQ@mail.gmail.com \
    --to=aglo@umich.edu \
    --cc=linux-nfs@vger.kernel.org \
    --cc=trond.myklebust@primarydata.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.