Linux-NFS Archive on lore.kernel.org
From: "Benjamin Coddington" <bcodding@redhat.com>
To: "James Harvey" <jamespharvey20@gmail.com>,
	"Trond Myklebust" <trondmy@hammerspace.com>
Cc: "Linux NFS Mailing List" <linux-nfs@vger.kernel.org>
Subject: Re: 5.3.0 Regression: rpc.nfsd v4 uninterruptible sleep for 5+ minutes w/o rpc-statd/etc
Date: Tue, 01 Oct 2019 14:21:26 -0400
Message-ID: <720574D9-90C7-4A79-8DA6-9A683CFD98CB@redhat.com>
In-Reply-To: <CA+X5Wn60sGi+za48Lj-y1fcHHw7kdzEUsw8nj+Xc0U90mONz5w@mail.gmail.com>

On 19 Sep 2019, at 9:00, James Harvey wrote:

> For a really long time (years?) if you forced NFS v4 only, you could
> mask a lot of unnecessary services.
>
> In /etc/nfs.conf, in "[nfsd]", I've been able to set "vers3=n", and then
> mask the following services:
> * gssproxy
> * nfs-blkmap
> * rpc-statd
> * rpcbind (service & socket)
>
> Upgrading from 5.2.14 to 5.3.0, nfs-server.service (rpc.nfsd) has
> exactly a 5 minute delay, and sometimes longer.

A bisect ends on:
4f8943f80883 SUNRPC: Replace direct task wakeups from softirq context

That commit changed the way we pull the error from the socket.  Previously
we'd wake the task with whatever error was in sk_err when xs_error_report()
ran.  Now we use SO_ERROR, but only after possibly running through
xs_wake_disconnect(), which forces a closure that can change sk_err.

So I think xs_error_report() sees ECONNREFUSED, but we wake tasks with
ENOTCONN, and the client machine spins us back around again to reconnect.
We keep doing this until things time out.

I'll send a patch to revert to the previous behavior of waking tasks with
the error as it was in xs_error_report by copying it over to the sock_xprt
struct and waking the tasks with that value.
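
To make the ordering problem concrete, here is a rough userspace model of
it.  This is not the kernel code and not the patch: the toy_* types, the
xprt_err field name, and the assumption that the forced closure leaves
ENOTCONN in sk_err are all mine, chosen only to mirror the description
above.

```c
#include <assert.h>
#include <errno.h>

/* Toy stand-in for a socket; only the pending error is modelled. */
struct toy_sock {
	int sk_err;		/* pending error, like sk->sk_err */
};

/* Toy stand-in for struct sock_xprt; xprt_err is a hypothetical name. */
struct toy_xprt {
	struct toy_sock sk;
	int xprt_err;		/* error copied at report time */
};

/* Assumption: a forced closure replaces the pending error with ENOTCONN. */
void toy_force_close(struct toy_sock *sk)
{
	sk->sk_err = ENOTCONN;
}

/* SO_ERROR-style read: return the pending error and clear it. */
int toy_read_and_clear(struct toy_sock *sk)
{
	int err = sk->sk_err;

	sk->sk_err = 0;
	return err;
}

/* Ordering after the bisected commit: close first, fetch the error later. */
int toy_wake_error_late(struct toy_xprt *x)
{
	toy_force_close(&x->sk);
	return -toy_read_and_clear(&x->sk);
}

/* Proposed ordering: copy the error at report time, wake with the copy. */
int toy_wake_error_snapshot(struct toy_xprt *x)
{
	x->xprt_err = x->sk.sk_err;	/* what the error report saw */
	toy_force_close(&x->sk);
	return -x->xprt_err;
}

/* Usage: both paths start from the same refused connection. */
void toy_race_demo(void)
{
	struct toy_xprt late = { .sk = { .sk_err = ECONNREFUSED } };
	struct toy_xprt snap = { .sk = { .sk_err = ECONNREFUSED } };

	assert(toy_wake_error_late(&late) == -ENOTCONN);
	assert(toy_wake_error_snapshot(&snap) == -ECONNREFUSED);
}
```

In the model, the late read hands the waiter ENOTCONN, which looks like "go
reconnect", while the snapshot preserves the ECONNREFUSED that was actually
reported.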

There's another subtle change here besides that race: SO_ERROR can return
the socket's soft error, not just what's in sk_err.  That can be fun things
like EINVAL if routing lookups fail.
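
A small userspace model of that difference, as I understand it: an
SO_ERROR-style read falls back to the soft error when no hard error is
pending, while a read of sk_err alone never sees it.  The soft_* names are
made up for illustration and the struct only mimics the two fields involved.

```c
#include <assert.h>
#include <errno.h>

/* Toy socket with both error slots. */
struct soft_sock {
	int sk_err;		/* hard error */
	int sk_err_soft;	/* soft error, e.g. from a failed route lookup */
};

/* Hard error only: read and clear sk_err, return it negated. */
int soft_sock_error(struct soft_sock *sk)
{
	int err = sk->sk_err;

	sk->sk_err = 0;
	return -err;
}

/* SO_ERROR-style read: hard error first, otherwise the soft error. */
int soft_so_error(struct soft_sock *sk)
{
	int val = -soft_sock_error(sk);

	if (val == 0) {
		val = sk->sk_err_soft;
		sk->sk_err_soft = 0;
	}
	return val;
}

/* Usage: only a soft error is pending. */
void soft_demo(void)
{
	struct soft_sock sk = { .sk_err = 0, .sk_err_soft = EINVAL };

	assert(soft_sock_error(&sk) == 0);	/* sk_err alone shows nothing */
	assert(soft_so_error(&sk) == EINVAL);	/* SO_ERROR-style read does */
	assert(soft_so_error(&sk) == 0);	/* and it was cleared */
}
```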

Ben

Thread overview: 4+ messages
2019-09-19 13:00 James Harvey
2019-09-19 13:17 ` James Harvey
2019-09-26 19:51   ` bfields
2019-10-01 18:21 ` Benjamin Coddington [this message]
