linux-nfs.vger.kernel.org archive mirror
From: Richard Purdie <richard.purdie@linuxfoundation.org>
To: Trond Myklebust <trondmy@hammerspace.com>,
	"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
Cc: "mhalstead@linuxfoundation.org" <mhalstead@linuxfoundation.org>
Subject: Re: TEST_STATEID issues with NFS4.1 and FreeNAS server
Date: Wed, 20 May 2020 19:06:02 +0100
Message-ID: <34c2810e78b6053b23f4d40c981d5609977e262d.camel@linuxfoundation.org>
In-Reply-To: <6e7c1125fb5533d1fad5d8b9130761df0fdf3516.camel@hammerspace.com>

On Wed, 2020-05-20 at 18:01 +0000, Trond Myklebust wrote:
> Hi Richard,
> 
> On Wed, 2020-05-20 at 18:47 +0100, Richard Purdie wrote:
> > Hi,
> > 
> > We have a cluster of machines where we're observing file accesses
> > hanging over NFS. The clients showing the problems are Fedora and SUSE
> > distros with the 5.6.11 kernel, e.g.:
> > 
> > Linux version 5.6.11-1-default (geeko@buildhost) (gcc version 9.3.1
> > 20200406 
> > [revision 6db837a5288ee3ca5ec504fbd5a765817e556ac2] (SUSE Linux)) 
> > #1 SMP Wed May 6 10:42:09 UTC 2020 (91c024a)
> > 
> > In the example below we see a git clone hang: it's having trouble
> > reading a .pack file off the NFS share, and the git process is in D
> > state. I've included part of dmesg below with sysrq-w output.
> > 
> > Mount options:
> > 
> > rw,relatime,vers=4.1,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,local_lock=none
> > 
> > mountstats shows:
> >  
> > READ:
> > 	632014263 ops (62%) 	629809108 errors (99%) 
> > TEST_STATEID:
> >  	363257078 ops (36%) 	363257078 errors (100%)
> > 
> > which is a clue to what is happening. I grabbed some data with tcpdump
> > and it shows the READ getting NFS4ERR_BAD_STATEID, followed by a
> > TEST_STATEID which gets NFS4ERR_NOTSUPP. This repeats in an infinite
> > loop.
> > 
> > The server is FreeNAS 11.3, which does not have
> > https://github.com/HardenedBSD/hardenedBSD-stable/commit/63f6f19b0756b18f2e68d82cbe037f21f9a8c500
> > applied, so it will return NFS4ERR_NOTSUPP to TEST_STATEID.
> > 
> > I think something may be needed to stop Linux getting into an infinite
> > loop with this, regardless of whether the spec says TEST_STATEID can
> > get an NFS4ERR_NOTSUPP or not?
> > 
> > I freely admit I know little about much of this so I'm open to
> > pointers. If we did remount as 4.0 we probably wouldn't see the issue
> > as it would avoid the TEST_STATEID code.
> 
> TEST_STATEID is listed in RFC5661 Section 17 as REQUIRED to implement
> for NFSv4.1. We will not be able to support a server that violates that
> requirement.

Understood, I suspected as much.

Locking systems into an infinite loop doesn't seem like a good user
experience though. Is there a way to handle that more gracefully?
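
To make "more gracefully" concrete, here is a rough user-space sketch of the
READ / TEST_STATEID exchange described above, with a bail-out when the server
answers NOTSUPP and a cap on retries. It is only an illustration of the idea,
not how the Linux NFS client is implemented: the names Status, RecoveryError,
read_with_bounded_recovery, do_read and test_stateid are all hypothetical, and
the two callables stand in for the real RPCs.

```python
from enum import Enum, auto


class Status(Enum):
    """Hypothetical stand-ins for the NFSv4.1 status codes seen in the trace."""
    OK = auto()
    BAD_STATEID = auto()   # models NFS4ERR_BAD_STATEID
    NOTSUPP = auto()       # models NFS4ERR_NOTSUPP


class RecoveryError(Exception):
    """Raised when state recovery cannot make progress."""


def read_with_bounded_recovery(do_read, test_stateid, max_attempts=3):
    """Retry a read, but never loop forever.

    do_read and test_stateid are caller-supplied callables returning a
    Status; they are assumptions of this sketch, not kernel interfaces.
    Returns the number of READ attempts it took to succeed.
    """
    for attempt in range(1, max_attempts + 1):
        status = do_read()
        if status is Status.OK:
            return attempt
        if status is not Status.BAD_STATEID:
            raise RecoveryError(f"READ failed with {status.name}")

        # The stateid was rejected, so ask the server whether it is valid.
        if test_stateid() is Status.NOTSUPP:
            # The server cannot answer, so retrying would repeat forever.
            # Surface an error to the caller instead.
            raise RecoveryError(
                "server returned NOTSUPP for TEST_STATEID; not retrying"
            )
        # Otherwise fall through and retry the READ.

    raise RecoveryError(f"READ still failing after {max_attempts} attempts")


if __name__ == "__main__":
    # Model the behaviour reported in this thread: every READ is rejected
    # and every TEST_STATEID is answered with NOTSUPP.
    try:
        read_with_bounded_recovery(
            lambda: Status.BAD_STATEID, lambda: Status.NOTSUPP
        )
    except RecoveryError as err:
        print(f"I/O error reported to the application: {err}")
```

The point of the sketch is only that one READ and one TEST_STATEID are enough
to detect the dead end; whether an I/O error or something else is the right
thing to report is a separate question.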

Cheers,

Richard


