From: "Benjamin Coddington" <bcodding@redhat.com>
To: "Trond Myklebust" <trondmy@primarydata.com>
Cc: "Schumaker Anna" <anna.schumaker@netapp.com>,
	"List Linux NFS Mailing" <linux-nfs@vger.kernel.org>,
	"Oleg Drokin" <green@linuxhacker.ru>
Subject: Re: [PATCH v7 13/31] NFSv4.1: Ensure we always run TEST/FREE_STATEID on locks
Date: Tue, 08 Nov 2016 10:10:08 -0500
Message-ID: <B41EEA6E-B168-4DE3-AC73-F4270287EF0D@redhat.com>
In-Reply-To: <6ABCDB9B-997E-49C1-9363-D59AF9BEC0E9@primarydata.com>



On 7 Nov 2016, at 9:59, Trond Myklebust wrote:

> On Nov 7, 2016, at 09:50, Benjamin Coddington <bcodding@redhat.com> wrote:
>
> On 7 Nov 2016, at 8:45, Benjamin Coddington wrote:
>
> On 7 Nov 2016, at 8:09, Benjamin Coddington wrote:
>
> On 4 Nov 2016, at 12:02, Benjamin Coddington wrote:
>
> Hi Trond,
>
> On 22 Sep 2016, at 13:39, Trond Myklebust wrote:
>
> Right now, we're only running TEST/FREE_STATEID on the locks if
> the open stateid recovery succeeds. The protocol requires us to
> always do so.
> The fix would be to move the call to TEST/FREE_STATEID and do it
> before we attempt open recovery.
>
> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
> ---
> fs/nfs/nfs4proc.c | 92 +++++++++++++++++++++++++++++++------------------------
> 1 file changed, 52 insertions(+), 40 deletions(-)
>
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 3c1b8cb7dd95..33ca6d768bd2 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -2486,6 +2486,45 @@ static void nfs41_check_delegation_stateid(struct nfs4_state *state)
> }
>
> /**
> + * nfs41_check_expired_locks - possibly free a lock stateid
> + *
> + * @state: NFSv4 state for an inode
> + *
> + * Returns NFS_OK if recovery for this stateid is now finished.
> + * Otherwise a negative NFS4ERR value is returned.
> + */
> +static int nfs41_check_expired_locks(struct nfs4_state *state)
> +{
> +	int status, ret = NFS_OK;
> +	struct nfs4_lock_state *lsp;
> +	struct nfs_server *server = NFS_SERVER(state->inode);
> +
> +	if (!test_bit(LK_STATE_IN_USE, &state->flags))
> +		goto out;
> +	list_for_each_entry(lsp, &state->lock_states, ls_locks) {
> +		if (test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags)) {
>
> I bisected a crash to this patch (commit
> c5896fc8622d57b31e1e98545d67d7089019e478). I thought the problem was
> that this patch moved this path out from under the nfsi->rwsem in
> nfs4_reclaim_locks(), so it ends up with a freed nfs4_lock_state here.
>
> I can reproduce this with generic/089.  Any ideas?
>
> Hit this on v4.9-rc4 this morning.  This probably needs to take the
> state_lock before traversing the lock_states list.  I guess we've never
> hit this before because the old path would serialize things somehow,
> maybe via taking flc_lock in nfs4_reclaim_locks().  I'll test that fix.
>
> Well, that's no good either, as it gets stuck in an NFS4ERR_OLD_STATEID
> loop in recovery: we'd want to retry in that case, but taking the
> state_lock means we won't use the new stateid.  So maybe we need both
> the state_lock to protect the list and the rwsem to stop new locks from
> being sent.  I'll try that now.
>
> That one got much further, but eventually soft-locked up on the
> state_lock when what looks like the state manager needed to have a
> TEST_STATEID wait on another lock to complete.
>
> The other question here is why are we doing recovery so much?  It seems
> like we're sending FREE_STATEID unnecessarily on successful DELEGRETURN
> and LOCKU, but that shouldn't be triggering state recovery.
>
> FREE_STATEID is required by the protocol after LOCKU, so that’s
> intentional.

I thought it wasn't required if we do CLOSE, but I checked again and that
wasn't what I was seeing.  I am seeing LOCKU, FREE_STATEID, CLOSE, so it is
correct.

> It isn’t needed after DELEGRETURN, so I’m not sure why that is
> happening.

I think it's just falling through the case statement in
nfs4_delegreturn_done().  I'll send a patch for that.
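
As a rough illustration of that kind of fall-through (a toy sketch only:
the status codes, the counter and every function name below are made up,
and this is not the body of nfs4_delegreturn_done()), a success case with
no break ends up running the cleanup meant for a revoked stateid:

```c
#include <assert.h>

/* Hypothetical status codes, for illustration only. */
enum { ST_OK = 0, ST_REVOKED = -1, ST_OTHER = -2 };

/* Counts how often the (hypothetical) FREE_STATEID helper is called. */
static int free_stateid_calls;
static int lease_renewals;

static void free_stateid(void)
{
	free_stateid_calls++;
}

/* Buggy shape: the success case has no break. */
static void done_fallthrough(int status)
{
	switch (status) {
	case ST_OK:
		lease_renewals++;
		/* falls through - the missing break is the bug */
	case ST_REVOKED:
		free_stateid();
		break;
	default:
		break;
	}
}

/* Fixed shape: success stops before the revoked-stateid cleanup. */
static void done_fixed(int status)
{
	switch (status) {
	case ST_OK:
		lease_renewals++;
		break;
	case ST_REVOKED:
		free_stateid();
		break;
	default:
		break;
	}
}

/* Run one handler for one status and report the number of frees. */
static int count_free_calls(void (*handler)(int), int status)
{
	free_stateid_calls = 0;
	handler(status);
	return free_stateid_calls;
}

/* Self-check: only the fall-through variant frees on success. */
void self_check(void)
{
	assert(count_free_calls(done_fallthrough, ST_OK) == 1);
	assert(count_free_calls(done_fixed, ST_OK) == 0);
	assert(count_free_calls(done_fixed, ST_REVOKED) == 1);
}
```

With this toy model, the buggy handler issues one free on success, while
the fixed one issues a free only for the revoked status.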

I think the fix here is to manually increment ls_count for the current lock
state, and expect that the lock_states list can be modified while we walk
it.  I'll send a patch for that too if it runs through testing OK.
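
A minimal user-space sketch of that pattern follows, assuming a pthread
mutex in place of state_lock and a plain int in place of ls_count.  Every
struct and function name here is hypothetical; this is a model of the
idea, not the kernel patch.  The walker pins the current entry, drops the
lock for the slow check, and releases the previous entry only after it has
moved on, so the entry it holds stays allocated and linked:

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a lock state: a refcounted list node. */
struct lock_state {
	struct lock_state *next;
	int count;	/* reference count, protected by list_lock */
	int id;
};

struct state {
	pthread_mutex_t list_lock;	/* stands in for state_lock */
	struct lock_state *head;
};

/* Add an entry holding one reference on behalf of its owner. */
static struct lock_state *add_lock_state(struct state *st, int id)
{
	struct lock_state *ls = calloc(1, sizeof(*ls));

	if (!ls)
		return NULL;
	ls->id = id;
	ls->count = 1;
	pthread_mutex_lock(&st->list_lock);
	ls->next = st->head;
	st->head = ls;
	pthread_mutex_unlock(&st->list_lock);
	return ls;
}

/* Drop a reference; the last put unlinks and frees the entry. */
static void put_lock_state(struct state *st, struct lock_state *ls)
{
	struct lock_state **pp;

	if (!ls)
		return;
	pthread_mutex_lock(&st->list_lock);
	assert(ls->count > 0);
	if (--ls->count > 0) {
		pthread_mutex_unlock(&st->list_lock);
		return;
	}
	for (pp = &st->head; *pp && *pp != ls; pp = &(*pp)->next)
		;
	if (*pp)
		*pp = ls->next;
	pthread_mutex_unlock(&st->list_lock);
	free(ls);
}

typedef void (*check_fn)(struct state *st, struct lock_state *ls, void *arg);

/* Walk the list, calling check() on each entry with the lock dropped. */
static int check_all(struct state *st, check_fn check, void *arg)
{
	struct lock_state *ls, *prev = NULL;
	int visited = 0;

	pthread_mutex_lock(&st->list_lock);
	for (ls = st->head; ls; ls = ls->next) {
		ls->count++;	/* pin the current entry */
		pthread_mutex_unlock(&st->list_lock);

		put_lock_state(st, prev);	/* done with the previous one */
		prev = ls;

		check(st, ls, arg);	/* may block; the list may change */
		visited++;

		pthread_mutex_lock(&st->list_lock);
	}
	pthread_mutex_unlock(&st->list_lock);
	put_lock_state(st, prev);
	return visited;
}

/* Example check: the owner drops its reference while we are walking. */
static void drop_owner_ref(struct state *st, struct lock_state *ls, void *arg)
{
	(void)arg;
	put_lock_state(st, ls);
}
```

In this model an entry is unlinked only by its final put, so a pinned
entry's next pointer is still safe to read once the lock is retaken, even
if its owner released it during the check.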

Ben
