From: Anna Schumaker <Anna.Schumaker@netapp.com>
To: Benjamin Coddington <bcodding@redhat.com>
Cc: Trond Myklebust <trondmy@primarydata.com>,
	List Linux NFS Mailing <linux-nfs@vger.kernel.org>,
	Oleg Drokin <green@linuxhacker.ru>
Subject: Re: [PATCH v7 13/31] NFSv4.1: Ensure we always run TEST/FREE_STATEID on locks
Date: Thu, 10 Nov 2016 15:54:46 -0500
Message-ID: <50b6aeb9-cb21-9f46-dadd-e7ba0f5d86ed@Netapp.com>
In-Reply-To: <BDCCD810-781F-4DD6-91E8-279A2C3377EF@redhat.com>

On 11/10/2016 03:18 PM, Benjamin Coddington wrote:
> 
> On 10 Nov 2016, at 10:58, Benjamin Coddington wrote:
> 
>> Hi Anna,
>>
>> On 10 Nov 2016, at 10:01, Anna Schumaker wrote:
>>> Do you have an estimate for when this patch will be ready?  I want to include it in my next bugfix pull request for 4.9.
>>
>> I haven't posted because I am still trying to get to the bottom of another
>> problem where the client gets stuck in a loop sending the same stateid over
>> and over on NFS4ERR_OLD_STATEID.  I want to make sure this problem isn't
>> caused by this fix -- which I don't think it is, but I'd rather make sure.
>> If I don't make any progress on this problem by the end of today, I'll post
>> what I have.
>>
>> Read on if interested in this new problem:
>>
>> It looks like racing opens with the same openowner can be returned out of
>> order by the server, so the client sees stateid seqid of 2 before 1.  Then a
>> LOCK sent with seqid 1 is endlessly retried if sent while doing recovery.
>>
>> It's hard to tell if I was able to capture all the moving parts to describe
>> this problem, though, as it takes a very long time for me to reproduce and
>> the packet captures were dropping frames.  I'm working on manually
>> reproducing it now.
> 
> Anna,
> 
> I haven't gotten to the bottom of it, and so I'm not confident it isn't a
> problem created by the fix I've been testing, which is:
> 
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index e809498..2aa9d86 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -2564,12 +2564,15 @@ static void nfs41_check_delegation_stateid(struct nfs4_state *state)
>  static int nfs41_check_expired_locks(struct nfs4_state *state)
>  {
>         int status, ret = NFS_OK;
> -       struct nfs4_lock_state *lsp;
> +       struct nfs4_lock_state *lsp, *tmp;
>         struct nfs_server *server = NFS_SERVER(state->inode);
> 
>         if (!test_bit(LK_STATE_IN_USE, &state->flags))
>                 goto out;
> -       list_for_each_entry(lsp, &state->lock_states, ls_locks) {
> +       spin_lock(&state->state_lock);
> +       list_for_each_entry_safe(lsp, tmp, &state->lock_states, ls_locks) {
> +               atomic_inc(&lsp->ls_count);
> +               spin_unlock(&state->state_lock);
>                 if (test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags)) {
>                         struct rpc_cred *cred = lsp->ls_state->owner->so_cred;
> 
> @@ -2588,7 +2591,10 @@ static int nfs41_check_expired_locks(struct nfs4_state *state)
>                                 break;
>                         }
>                 }
> -       };
> +               nfs4_put_lock_state(lsp);
> +               spin_lock(&state->state_lock);
> +       }
> +       spin_unlock(&state->state_lock);
>  out:
>         return ret;
>  }
> 
> http://people.redhat.com/bcodding/old_stateid_loop is tshark output of my
> only good wire capture of the problem.  Without this patch, generic/089
> crashes long before this problem is reproduced, so I am stuck figuring it
> out, I'm afraid.  Don't wait on my account.
> 
> I plan on trying a bit more to reproduce tomorrow, and if I cannot, I'll
> write about it under separate cover.

Sounds good.  Thanks for the update!

Anna

> 
> Ben
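
For readers following the quoted diff: the change pins each lock state with a
reference, drops state_lock around the TEST_STATEID/FREE_STATEID check, then
drops the reference and retakes the lock before moving on.  Below is a minimal
userspace sketch of that pattern, not the kernel code.  All names in it
(struct entry, struct owner, check_entry, check_all) are hypothetical, a
pthread mutex stands in for the spinlock, and a plain counter stands in for
ls_count.  One deliberate difference from the diff: the sketch reads the
successor pointer after retaking the lock, while the current entry is still
pinned, instead of caching it up front as list_for_each_entry_safe() does.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a lock state on a per-file list. */
struct entry {
	struct entry *next;
	int refcount;     /* protected by owner->lock in this sketch */
	int initialized;  /* plays the role of NFS_LOCK_INITIALIZED */
	int checked;      /* set once the check has run on this entry */
};

/* Hypothetical stand-in for the object that owns the list and its lock. */
struct owner {
	pthread_mutex_t lock;
	struct entry *head;
};

/* Stand-in for the per-entry check, which must not run under the lock. */
static int check_entry(struct entry *e)
{
	e->checked = 1;
	return 0;
}

/* Walk the list, dropping the lock around each per-entry check. */
static int check_all(struct owner *o)
{
	struct entry *e, *next;
	int ret = 0;

	pthread_mutex_lock(&o->lock);
	for (e = o->head; e != NULL; e = next) {
		e->refcount++;                  /* pin e before unlocking */
		pthread_mutex_unlock(&o->lock);

		if (e->initialized && check_entry(e) != 0)
			ret = -1;

		pthread_mutex_lock(&o->lock);
		next = e->next;                 /* read while e is still pinned */
		assert(e->refcount > 0);
		e->refcount--;                  /* unpin under the lock */
	}
	pthread_mutex_unlock(&o->lock);
	return ret;
}
```

The sketch never frees entries, so it only illustrates the lock/reference
ordering; whether the successor can go away while the lock is dropped is
exactly the kind of question the real list walk has to answer.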
