From: "J. Bruce Fields" <bfields@fieldses.org>
To: Manjunath Patil <manjunath.b.patil@oracle.com>
Cc: linux-nfs@vger.kernel.org
Subject: Re: [PATCH 2/2] nfsd: return ENOSPC if unable to allocate a session slot
Date: Mon, 25 Jun 2018 18:04:00 -0400 [thread overview]
Message-ID: <20180625220400.GE8293@fieldses.org> (raw)
In-Reply-To: <bde64edc-5684-82d7-4488-e2ebdd7018fc@oracle.com>
On Mon, Jun 25, 2018 at 10:17:21AM -0700, Manjunath Patil wrote:
> Hi Bruce,
>
> I could reproduce this issue by lowering the amount of RAM. On my
> VirtualBox VM with 176 MB of RAM I can reproduce this with 3
> clients.
I know how to reproduce it; I was just wondering what motivated it: were
customers hitting it (and if so, how), or was it just artificial testing?
Oh well, it probably needs to be fixed regardless.
--b.
> My kernel didn't have the following fixes -
>
> de766e5 nfsd: give out fewer session slots as limit approaches
> 44d8660 nfsd: increase DRC cache limit
>
> Once I apply these patches, the issue recurs with 10+ clients.
> Once the mount starts to hang due to this issue, an NFSv4.0 mount still succeeds.
>
> I took the latest mainline kernel [4.18.0-rc1] and made the server
> return NFS4ERR_DELAY [nfserr_jukebox] if it's unable to allocate 50
> slots [just to accelerate the issue]:
>
> - if (!ca->maxreqs)
> + if (ca->maxreqs < 50) {
> ...
> return nfserr_jukebox;
>
> Then used the same client [4.18.0-rc1] and observed that the mount call
> still hangs [indefinitely].
> Typically the client hangs here - [stacks are from the Oracle kernel] -
>
> [root@OL7U5-work ~]# ps -ef | grep mount
> root 2032 1732 0 09:49 pts/0 00:00:00 strace -tttvf -o
> /tmp/a.out mount 10.211.47.123:/exports /NFSMNT -vvv -o retry=1
> root 2034 2032 0 09:49 pts/0 00:00:00 mount
> 10.211.47.123:/exports /NFSMNT -vvv -o retry=1
> root 2035 2034 0 09:49 pts/0 00:00:00 /sbin/mount.nfs
> 10.211.47.123:/exports /NFSMNT -v -o rw,retry=1
> root 2039 1905 0 09:49 pts/1 00:00:00 grep --color=auto mount
> [root@OL7U5-work ~]# cat /proc/2035/stack
> [<ffffffffa05204d2>] nfs_wait_client_init_complete+0x52/0xc0 [nfs]
> [<ffffffffa05872ed>] nfs41_discover_server_trunking+0x6d/0xb0 [nfsv4]
> [<ffffffffa0587802>] nfs4_discover_server_trunking+0x82/0x2e0 [nfsv4]
> [<ffffffffa058f8d6>] nfs4_init_client+0x136/0x300 [nfsv4]
> [<ffffffffa05210bf>] nfs_get_client+0x24f/0x2f0 [nfs]
> [<ffffffffa058eeef>] nfs4_set_client+0x9f/0xf0 [nfsv4]
> [<ffffffffa059039e>] nfs4_create_server+0x13e/0x3b0 [nfsv4]
> [<ffffffffa05881b2>] nfs4_remote_mount+0x32/0x60 [nfsv4]
> [<ffffffff8121df3e>] mount_fs+0x3e/0x180
> [<ffffffff8123a6db>] vfs_kern_mount+0x6b/0x110
> [<ffffffffa05880d6>] nfs_do_root_mount+0x86/0xc0 [nfsv4]
> [<ffffffffa05884c4>] nfs4_try_mount+0x44/0xc0 [nfsv4]
> [<ffffffffa052ed6b>] nfs_fs_mount+0x4cb/0xda0 [nfs]
> [<ffffffff8121df3e>] mount_fs+0x3e/0x180
> [<ffffffff8123a6db>] vfs_kern_mount+0x6b/0x110
> [<ffffffff8123d5c1>] do_mount+0x251/0xcf0
> [<ffffffff8123e3a2>] SyS_mount+0xa2/0x110
> [<ffffffff81751f4b>] tracesys_phase2+0x6d/0x72
> [<ffffffffffffffff>] 0xffffffffffffffff
>
> [root@OL7U5-work ~]# cat /proc/2034/stack
> [<ffffffff8108c147>] do_wait+0x217/0x2a0
> [<ffffffff8108d360>] do_wait4+0x80/0x110
> [<ffffffff8108d40d>] SyS_wait4+0x1d/0x20
> [<ffffffff81751f4b>] tracesys_phase2+0x6d/0x72
> [<ffffffffffffffff>] 0xffffffffffffffff
>
> [root@OL7U5-work ~]# cat /proc/2032/stack
> [<ffffffff8108c147>] do_wait+0x217/0x2a0
> [<ffffffff8108d360>] do_wait4+0x80/0x110
> [<ffffffff8108d40d>] SyS_wait4+0x1d/0x20
> [<ffffffff81751ddc>] system_call_fastpath+0x18/0xd6
> [<ffffffffffffffff>] 0xffffffffffffffff
>
> -Thanks,
> Manjunath
> On 6/24/2018 1:26 PM, J. Bruce Fields wrote:
> >By the way, could you share some more details with us about the
> >situation when you (or your customers) are actually hitting this case?
> >
> >How many clients, what kind of clients, etc. And what version of the
> >server were you seeing the problem on? (I'm mainly curious whether
> >de766e570413 and 44d8660d3bb0 were already applied.)
> >
> >I'm glad we're thinking about how to handle this case, but my feeling is
> >that the server is probably just being *much* too conservative about
> >these allocations, and the most important thing may be to fix that and
> >make it a lot rarer that we hit this case in the first place.
> >
> >--b.
>
Thread overview: 18+ messages
2018-06-21 16:35 [PATCH 1/2] nfsv4: handle ENOSPC during create session Manjunath Patil
2018-06-21 16:35 ` [PATCH 2/2] nfsd: return ENOSPC if unable to allocate a session slot Manjunath Patil
2018-06-22 17:54 ` J. Bruce Fields
2018-06-22 21:49 ` Chuck Lever
2018-06-22 22:31 ` Trond Myklebust
2018-06-22 23:10 ` Trond Myklebust
2018-06-23 19:00 ` Chuck Lever
2018-06-24 13:56 ` Trond Myklebust
2018-06-25 15:39 ` Chuck Lever
2018-06-25 16:45 ` Trond Myklebust
2018-06-25 17:03 ` Manjunath Patil
2018-06-24 20:26 ` J. Bruce Fields
[not found] ` <bde64edc-5684-82d7-4488-e2ebdd7018fc@oracle.com>
2018-06-25 22:04 ` J. Bruce Fields [this message]
2018-06-26 17:20 ` Manjunath Patil
2018-07-09 14:25 ` J. Bruce Fields
2018-07-09 21:57 ` Trond Myklebust
2018-06-21 17:04 ` [PATCH 1/2] nfsv4: handle ENOSPC during create session Trond Myklebust
2018-06-22 14:28 ` Manjunath Patil