From: Olga Kornievskaia <aglo@umich.edu>
To: Benjamin Coddington <bcodding@redhat.com>
Cc: Michael Wakabayashi <mwakabayashi@vmware.com>,
	Trond Myklebust <trondmy@hammerspace.com>,
	linux-nfs <linux-nfs@vger.kernel.org>,
	Steve Dickson <SteveD@redhat.com>
Subject: Re: NFSv4: Mounting NFS server which is down, blocks all other NFS mounts on same machine
Date: Wed, 9 Jun 2021 10:41:10 -0400
Message-ID: <CAN-5tyF4fsfeuzcbXzyWNfQ3wSx2WDxMtyk+dPUdd7H4nJ8hug@mail.gmail.com>
In-Reply-To: <FCDAEE4A-33CB-4939-8001-DAAFD7BC8638@redhat.com>

On Wed, Jun 9, 2021 at 10:31 AM Benjamin Coddington <bcodding@redhat.com> wrote:
>
> On 9 Jun 2021, at 1:31, Michael Wakabayashi wrote:
>
> > Hi Olga,
> >
> > There seems to be a discrepancy between what you're seeing and what
> > we're seeing.
> >
> > So we were wondering if you could please run these commands in your
> > Linux environment and paste the output of the mount command below?
> >     $ sudo mkdir -p /tmp/mnt.dead
> >     $ time sudo mount -o vers=4 -vvv 2.2.2.2:/fake_path /tmp/mnt.dead
> >
> > We'd like the mount command to specifically use "2.2.2.2:/fake_path"
> > since we know it is unreachable and outside your subnet.
> > We're hoping by mounting "2.2.2.2:/fake_path" you'll be able to
> > reproduce the same behavior that we're seeing.
> >
> > Also, if possible, a packet trace would be helpful:
> >     $ sudo tcpdump -s 0 -w /tmp/nfsv4.pcap port 2049
> >
> > On my Ubuntu VirtualMachine, I see this output:
> >     ubuntu@mikes-ubuntu-21-04:~$ time sudo mount -o vers=4 -vvv 2.2.2.2:/fake_path /tmp/mnt.dead
> >     mount.nfs: timeout set for Wed Jun  9 05:12:15 2021
> >     mount.nfs: trying text-based options 'vers=4,addr=2.2.2.2,clientaddr=10.162.132.231'
> >     mount.nfs: mount(2): Connection timed out
> >     mount.nfs: Connection timed out
> >     real  3m1.257s
> >     user  0m0.006s
> >     sys 0m0.007s
> >
> > Thanks, Mike
>
> It looks to me like you and Olga are seeing the same thing, a wait
> through SYN retries scaling up from initial RTO for the number of
> tcp_syn_retries.

Ben, I disagree. Mike and I are seeing different things. Mike is
seeing SYNs being sent. I argue that SYNs should not be sent. I agree
that if SYNs are sent, that would cause a problem.
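
As a rough illustration of the wait Ben describes, the sketch below sums
an exponential SYN backoff. It assumes an initial RTO of 1 second and
tcp_syn_retries=6 (the usual Linux defaults; the values on either test
machine are not shown in this thread), and it assumes the timeout simply
doubles after each attempt. The function name syn_wait_seconds is
hypothetical.

```shell
#!/bin/sh
# Sketch: total time a connect() waits when every SYN goes unanswered.
# Assumption: the timeout doubles after each attempt and no RTO cap is hit.
syn_wait_seconds() {
    rto=$1        # initial retransmission timeout in seconds (assumed 1)
    retries=$2    # value of net.ipv4.tcp_syn_retries (assumed 6)
    total=0
    attempt=0
    # One initial SYN plus $retries retransmissions, each followed by a wait.
    while [ "$attempt" -le "$retries" ]; do
        total=$((total + rto))
        rto=$((rto * 2))
        attempt=$((attempt + 1))
    done
    echo "$total"
}

syn_wait_seconds 1 6
```

With these assumed values the sum is 1+2+4+8+16+32+64 = 127 seconds. The
retry count actually in effect can be read with
"sysctl net.ipv4.tcp_syn_retries".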

> It's not disputed that mounts waiting on the transport layer will block
> other mounts.
>
> It might be able to be changed:  there's this torch:
> https://lore.kernel.org/linux-nfs/87378omld4.fsf@notabene.neil.brown.name/

We already discussed that this is not a solution as the NFS layer has
to serialize the client creation attempts.

> ..or there may be another way we don't have to wait ..
>
> .. or tune tcp_syn_retries.. or RTO.. or something else (eBPF?).
>
> I think we're all strapped for time and problems like this usually get
> fixed by the folks feeling the most pain from them.

I think we still do not understand what network setup leads a client
to send a SYN (which is incorrect) to what is supposed to be an
unreachable server, instead of timing out fast (because there
shouldn't be an ARP entry).

Mike, can you show your arp cache info (arp -n) during your run?
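
One possible way to capture that, as a sketch only: the helper below
runs a command repeatedly with a timestamp before each run, so the ARP
cache can be sampled while the mount attempt is still pending. The
helper name sample_cmd, the 18 x 10 second sampling, and the log path
are hypothetical choices, not something from this thread.

```shell
#!/bin/sh
# Sketch: run a command repeatedly, printing a timestamp before each run.
# usage: sample_cmd COUNT INTERVAL_SECONDS COMMAND [ARGS...]
sample_cmd() {
    count=$1
    interval=$2
    shift 2
    i=0
    while [ "$i" -lt "$count" ]; do
        date '+%H:%M:%S'
        "$@"
        sleep "$interval"
        i=$((i + 1))
    done
}

# Example (commented out because it needs root and a real mount attempt):
#   sudo mount -o vers=4 2.2.2.2:/fake_path /tmp/mnt.dead &
#   sample_cmd 18 10 arp -n > /tmp/arp-during-mount.log 2>&1
#   wait
```

Sampling for about three minutes covers the 3m1s mount duration
reported above.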

>
> Ben
>
