From: Dan Aloni <dan@kernelim.com>
To: Trond Myklebust <trondmy@hammerspace.com>
Cc: "smayhew@redhat.com" <smayhew@redhat.com>,
	"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
Subject: Re: NFS v3 soft mount semantics affected by commit ce368536d
Date: Wed, 2 Dec 2020 16:45:06 +0200
Message-ID: <20201202144506.GA1257606@gmail.com>
In-Reply-To: <dc888c162b3a30cd1c617072ae606d9d8c6d42f3.camel@hammerspace.com>

On Thu, Nov 26, 2020 at 01:48:23PM +0000, Trond Myklebust wrote:
> On Thu, 2020-11-26 at 12:47 +0200, Dan Aloni wrote:
> > Hi Scott, Trond,
> > 
> > Commit ce368536dd614452407dc31e2449eb84681a06af ("nfs: nfs_file_write()
> > should check for writeback errors") seems to have affected NFS v3 soft
> > mount behavior, causing applications to fail on a slow band connection
> > with a properly functioning server. I checked this with recent Linux
> > 5.10-rc5, and on 5.8.18 to where this commit is backported.
> > 
> > Question: while the NFS v4 protocol talks about a soft mount timeout
> > behavior at "RFC7530 section 3.1.1" (see reference and patchset
> > addressing it in [1]), is it valid to assume that a similar guarantee
> > for NFS v3 soft mounts is expected?
> > 
> > The reason why it is important, is because the fulfilment of this
> > guarantee seemed to have changed with this recent patch.
> > 
> > Details on reproduction - using the following mount option:
> > 
> >    vers=3,rsize=1048576,wsize=1048576,soft,proto=tcp,timeo=50,retrans=16
> 
> Sorry, but those are completely silly timeo and retrans values for a
> TCP connection. I see no reason why we should try to support them.

The same issue is reproducible with other settings that yield a similar
major timeout (majortimeo), for example timeo=400,retrans=1.
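
To make "similar majortimeo" concrete, here is a rough sketch of the
arithmetic. It assumes the linear backoff that nfs(5) describes for TCP
(each retransmission waits `timeo` longer than the previous one) and a
600-second ceiling on the total; the function name and the exact formula
are illustrative assumptions, not the kernel's code.

```python
def major_timeout_seconds(timeo: int, retrans: int, cap_seconds: float = 600.0) -> float:
    """Approximate major timeout of an NFS-over-TCP mount, in seconds.

    Assumptions (not taken from the kernel source): timeo is given in
    tenths of a second, the timeout grows linearly by timeo per retry,
    and the total is capped at cap_seconds.
    """
    initial = timeo / 10.0             # timeo is in deciseconds
    total = initial * (retrans + 1)    # initial wait plus one increment per retry
    return min(total, cap_seconds)


print(major_timeout_seconds(50, 16))   # 85.0
print(major_timeout_seconds(400, 1))   # 80.0
```

Under these assumptions both option sets give a major timeout of roughly
80 to 85 seconds, which is why they behave alike here.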

Looking under `/sys/kernel/debug`, I see RPC tasks that are ready to
transmit accumulating by the thousands. If outgoing throughput is
constrained such that the WRITE backlog is larger than what can be
transmitted within the majortimeo window, the tasks end with EIO. This
may sound contrived, but it is achievable with network interfaces of
regular throughput, given enough writers.
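
A back-of-the-envelope model of that condition, as a sketch only: it
assumes every queued WRITE starts its timer at the same moment, that the
queue drains in FIFO order at a fixed throughput, and that all WRITEs
have the same size. The function name and the numbers in the usage lines
are hypothetical, not measurements.

```python
def count_timed_out_writes(queued: int, write_size: int,
                           throughput: int, major_timeout: int) -> int:
    """Count queued WRITEs that cannot be sent before the major timeout.

    Simplified model (an assumption, not the sunrpc implementation):
    task k (1-based) finishes transmitting at k * write_size / throughput
    seconds, and any task finishing after major_timeout ends with EIO.
    """
    if throughput <= 0 or write_size <= 0:
        raise ValueError("throughput and write_size must be positive")
    # Number of whole WRITEs that fit into the timeout window.
    fit = (major_timeout * throughput) // write_size
    return max(0, queued - fit)


# Hypothetical figures: 10000 queued 1 MiB WRITEs, 100 MB/s link, 80 s window.
print(count_timed_out_writes(10000, 1048576, 100_000_000, 80))  # 2371
```

With a small queue (say 100 WRITEs) the same model reports no failures,
which matches the observation that the problem needs enough writers.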

According to my observation, this was not the case prior to Linux v5.1.
With the older sunrpc implementation, these tasks would have waited in
the 'reserved' state without any timeout being calculated for them. Tasks
moved to the transmit stage and started counting down to a timeout only
once there was write space on the socket to transmit them.

I looked around and saw that many vendors recommend lowering the
`sunrpc.tcp_max_slot_table_entries` sysctl from 65536 to 128. This keeps
the transmit queue small instead of letting it grow to tens of thousands
of tasks, and the remaining tasks stay in the backlog without failing.
With the older SunRPC, the 65536 maximum did not matter because the
write space restriction 'naturally' had the same effect.
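
As a rough illustration of why the smaller slot table helps, the sketch
below compares the worst-case time to drain a full transmit queue against
the major timeout. It assumes the slot limit bounds the number of WRITEs
that are counting down at once, that each is `wsize` bytes, and that the
link drains at a fixed rate; the helper names and the 100 MB/s figure are
hypothetical. The sysctl is read from its usual procfs location, which
exists only when the sunrpc module is loaded.

```python
from pathlib import Path

# procfs view of the sunrpc.tcp_max_slot_table_entries sysctl.
SYSCTL_PATH = "/proc/sys/sunrpc/tcp_max_slot_table_entries"


def read_slot_limit(path: str = SYSCTL_PATH):
    """Return the current slot limit as an int, or None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def worst_case_drain_seconds(slots: int, write_size: int, throughput: int) -> float:
    """Seconds needed to send `slots` WRITEs of write_size bytes each.

    Simplified model (an assumption): the queue is full and drains at a
    constant `throughput` in bytes per second.
    """
    return slots * write_size / throughput


# Hypothetical link speed; compare both limits against an 80 s major timeout.
for limit in (128, 65536):
    drain = worst_case_drain_seconds(limit, 1048576, 100_000_000)
    print(limit, "fits" if drain < 80 else "can exceed the major timeout")
```

In this model a limit of 128 drains in well under two seconds, while
65536 can take several minutes, far longer than the major timeout.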

And indeed, the lower setting fixes the issue I originally reported and
helps retain the old behavior, where the goal of a soft mount (at least
in my case) is to get EIO for requests that are stuck at the server
rather than at the client.

-- 
Dan Aloni
