All of lore.kernel.org
 help / color / mirror / Atom feed
From: Trond Myklebust <trond.myklebust@primarydata.com>
To: Olga Kornievskaia <aglo@umich.edu>
Cc: linux-nfs <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH 1/1] Fixing infinite state recovery loop due to failed delegation return
Date: Thu, 6 Nov 2014 14:15:58 -0500	[thread overview]
Message-ID: <CAHQdGtQw_7tWuGEZOwdUWN6TEaTn85kkmAQb_nPF5QaTrk487w@mail.gmail.com> (raw)
In-Reply-To: <CAN-5tyE3pH0Y-RRGEoufQ-YWBJ=CQ_fBa0UJN0EF=e+TiHEzVQ@mail.gmail.com>

On Thu, Nov 6, 2014 at 1:54 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
> On Wed, Nov 5, 2014 at 1:38 PM, Trond Myklebust
> <trond.myklebust@primarydata.com> wrote:
>> On Wed, Nov 5, 2014 at 12:53 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>> Thank you for looking into this!
>>>
>>> On Tue, Nov 4, 2014 at 7:14 PM, Trond Myklebust
>>> <trond.myklebust@primarydata.com> wrote:
>>>> On Tue, Nov 4, 2014 at 6:22 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>>>> Trond, can you please respond to the patch?
>>>>>
>>>>> As per earlier conversation, in this solution, state recovery is
>>>>> initiated which marks the locks lost.
>>>>>
>>>>> Please either accept this patch or let me know what needs to be fixed.
>>>>>
>>>>
>>>> Please see the 3 fixes I just sent out concerning delegation recovery
>>>> w.r.t. NFSv4+NFSv4.1. In addition, we need to handle the case you
>>>> patch attempts to address (however see the question I have below).
>>>>
>>>>> Thank you.
>>>>>
>>>>> On Mon, Nov 3, 2014 at 9:36 AM, Olga Kornievskaia <kolga@netapp.com> wrote:
>>>>>> If we get a bad-stateid-type of error when we send OPEN with delegate_cur
>>>>>> to return currently held delegation, we shouldn't be trying to reclaim locks
>>>>>> associated with that delegation state_id because we don't have an
>>>>>> open_stateid to be used for the LOCK operation. Thus, we should
>>>>>> return an error from the nfs4_open_delegation_recall() in that case.
>>>>>>
>>>>>> Furthermore, if an error occurs the delegation code will call
>>>>>> nfs_abort_delegation_return() which sets again the NFS4CLNT_DELEGRETURN
>>>>>> flags in the state and it leads the state manager to into an infinite loop
>>>>>> for trying to reclaim the delegated state.
>>>>>>
>>>>>> Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
>>>>>> ---
>>>>>>  fs/nfs/delegation.c |    5 +++--
>>>>>>  fs/nfs/nfs4proc.c   |    2 +-
>>>>>>  2 files changed, 4 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
>>>>>> index 5853f53..8016d89 100644
>>>>>> --- a/fs/nfs/delegation.c
>>>>>> +++ b/fs/nfs/delegation.c
>>>>>> @@ -394,7 +394,7 @@ static int nfs_end_delegation_return(struct inode *inode, struct nfs_delegation
>>>>>>                 err = nfs4_wait_clnt_recover(clp);
>>>>>>         } while (err == 0);
>>>>>>
>>>>>> -       if (err) {
>>>>>> +       if (err && err != -EIO) {
>>>>>>                 nfs_abort_delegation_return(delegation, clp);
>>>>
>>>> This exception for EIO now has me worried. If we detach the
>>>> delegation, then it looks to me as if we will never send a
>>>> FREE_STATEID, as required for the case of NFSv4.1.
>>>
>>> I'd need to brush up on the FREE_STATEID op to confidently answer
>>> that. But can't this be taken care of in state_recovery when locks are
>>> marked lost?
>>
>> No, because in the case of EIO we now fall through to
>> nfs_detach_delegation() and nfs_do_return_delegation(). What we really
>> want to do is call TEST_STATEID + FREE_STATEID (see RFC5661 sections
>> 8.2.4, 8.5 and 18.38.3), which will now be done as part of state
>> recovery.
>
> I don't see it. :-/ Is it because there is a race between returning a
> delegation and state recovery that was initiated? Because as far as I
> can see, the stateid recovery initiated by the
> handle_delegation_recall_error() will call nfs4_test_stateid() via
> nfs41_open_expire().
>

No. My point is that once we call nfs_detach_delegation(), then nobody
will be able to find the delegation from either the struct nfs_inode
or from the server->delegations list.
IOW: the question is how would the state recovery thread find it in
the first place?

-- 
Trond Myklebust

Linux NFS client maintainer, PrimaryData

trond.myklebust@primarydata.com

  reply	other threads:[~2014-11-06 19:16 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-11-03 14:36 [PATCH 1/1] Fixing infinite state recovery loop due to failed delegation return Olga Kornievskaia
2014-11-04 23:22 ` Olga Kornievskaia
2014-11-05  0:14   ` Trond Myklebust
2014-11-05 17:53     ` Olga Kornievskaia
2014-11-05 18:38       ` Trond Myklebust
2014-11-06 18:54         ` Olga Kornievskaia
2014-11-06 19:15           ` Trond Myklebust [this message]
  -- strict thread matches above, loose matches on Subject: below --
2014-10-24 20:13 Olga Kornievskaia

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAHQdGtQw_7tWuGEZOwdUWN6TEaTn85kkmAQb_nPF5QaTrk487w@mail.gmail.com \
    --to=trond.myklebust@primarydata.com \
    --cc=aglo@umich.edu \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.