From: bfields@fieldses.org (J. Bruce Fields)
To: Jason L Tibbitts III <tibbs@math.uh.edu>
Cc: linux-nfs@vger.kernel.org
Subject: Re: NFS: nfs4_reclaim_open_state: Lock reclaim failed! log spew
Date: Thu, 25 Feb 2016 14:58:27 -0500
Message-ID: <20160225195827.GC23315@fieldses.org>
In-Reply-To: <ufafuwhlr72.fsf@epithumia.math.uh.edu>

On Wed, Feb 24, 2016 at 03:43:45PM -0600, Jason L Tibbitts III wrote:
> My NFS infrastructure has servers running current RHEL7.2 (mostly kernel
> 3.10.0-327.4.5.el7 with a one-line patch needed to fix a soft lockup in
> nfs4_laundromat) and clients running current Fedora 23
> (4.3.5-300.fc23.x86_64).  Everything is mounted NFS4.1 with sec=krb5p.
> 
> Occasionally a client will get into a state where it just hammers the
> server with network traffic, sometimes at full line rate, with:
> 
> NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> 
> spewed to the log about 500 times a second.  The load goes up quite a
> bit (to 5-7 or so).  The machine isn't doing anything and there isn't
> even a user logged in.  However, there are always a few user processes
> hanging around, usually kwin_x11 for whatever reason.  (My guess is
> because of a lock on ~/.Xauthority.)
> 
> When I kill those user processes, this is logged once:
> 
> NFS: nfs4_reclaim_open_state: unhandled error -10068
> 
> -10068 is NFS4ERR_RETRY_UNCACHED_REP.

The only place the server sets that error is in
fs/nfsd/nfs4state.c:nfsd4_enc_sequence_replay.

If the server's correct, then the client attempted to resend a request
that the server was not required to cache.  In which case
NFS4ERR_RETRY_UNCACHED_REP is a valid error, and the client should give
up (or retry with a new slot/seqid?).

In any case, something's wrong with the 4.1 reply caching logic on
client or server.....
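
For reference, the per-slot check that leads to this error can be sketched
roughly as below. This is a simplified illustration assuming the slot
semantics described in RFC 5661, not the actual nfsd code: the struct, the
function name, and the SKETCH_REPLAY_FROM_CACHE sentinel are all
hypothetical, and a real server may choose to cache a reply even when the
client did not ask for it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Error values as assigned in RFC 5661. */
#define NFS4_OK                     0
#define NFS4ERR_SEQ_MISORDERED      10063
#define NFS4ERR_RETRY_UNCACHED_REP  10068

/* Hypothetical sentinel: tells the caller to resend the cached reply. */
#define SKETCH_REPLAY_FROM_CACHE    (-1)

/* Hypothetical, simplified per-slot state (not the nfsd structure). */
struct sketch_slot {
	uint32_t seqid;        /* seqid of the last request executed on this slot */
	bool     reply_cached; /* was the full reply kept for that request? */
};

/*
 * Classify an incoming SEQUENCE on a slot:
 *  - seqid one past the slot's seqid: a new request, execute it;
 *  - seqid equal to the slot's seqid: a retransmission, answer from the
 *    cache if a reply was kept, otherwise NFS4ERR_RETRY_UNCACHED_REP;
 *  - anything else: NFS4ERR_SEQ_MISORDERED.
 */
static int sketch_check_sequence(struct sketch_slot *slot, uint32_t seqid,
				 bool cachethis)
{
	if (seqid == slot->seqid + 1) {	/* unsigned arithmetic, wraps at 2^32 */
		slot->seqid = seqid;
		/* Assumption: cache the reply only when the client asks. */
		slot->reply_cached = cachethis;
		return NFS4_OK;
	}
	if (seqid == slot->seqid)
		return slot->reply_cached ? SKETCH_REPLAY_FROM_CACHE
					  : NFS4ERR_RETRY_UNCACHED_REP;
	return NFS4ERR_SEQ_MISORDERED;
}
```

So in this model a client only sees -10068 when it resends, with the same
slot and seqid, a request it originally sent without asking for the reply to
be cached.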

> Unfortunately I did not grab any of that traffic (I just wanted it to
> stop).  This happens to me periodically so I'll be sure to do that when
> it hits again.

OK, that'd be helpful.  Unfortunately what would probably be *most*
helpful would be the traffic that led up to this--by the time the
client and server get into this loop the interesting problem may have
already happened--but just seeing the loop may be useful too.

--b.

> One theory is that this is related to a user's kerberos ticket
> expiring.  I see some hits when I search for the line that's spewed, but
> they're either not recent or weren't reproducible.  I don't find any
> hits for that specific unhandled error.
> 
>  - J<
