All of lore.kernel.org
 help / color / mirror / Atom feed
* Linux NFS v4.1 server support for dynamic slot allocation?
@ 2019-02-18 21:46 Chris Tracy
  2019-02-20 17:10 ` J. Bruce Fields
  0 siblings, 1 reply; 5+ messages in thread
From: Chris Tracy @ 2019-02-18 21:46 UTC (permalink / raw)
  To: linux-nfs

Hello,

 	Hopefully I'm not missing something obvious, but I'm curious 
whatever happened to the patch series from late 2012 that added dynamic 
v4.1 session slot allocation support to nfsd:

https://www.spinics.net/lists/linux-nfs/msg34390.html

The corresponding nfs client patches were integrated, but the nfsd series 
seems to have been left out due to release timing:

https://www.spinics.net/lists/linux-nfs/msg34505.html

However, they don't seem to ever have been integrated or discussed again. 
Were there other issues that prevented its inclusion in the intervening 
time?

 	Alternatively, is there some admin-tweakable knob for controlling 
the number of slots available per-session on the NFS v4.1 server 
(nfsd.ko), similar to the 'max_session_slots' client-side parameter for 
nfs.ko?

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/fs/nfs?id=ef159e9177cc5a09e6174796dde0b2d243ddf28b

 	I ask because I'm currently standing up a (very) modest HPC 
cluster PoC (1 server, 8 client nodes, all 10Gbit, all running CentOS 7.6) 
and figured that was a good enough excuse to finally move away from NFS v3 
and investigate NFS v4.x.  However, initial performance testing showed 
that while NFS v4.0 was essentially identical to v3, NFS v4.2 (and v4.1) 
were around 25% slower.

 	Looking at the traffic in wireshark, I see that in CREATE_SESSION, 
the client sets ca_maxrequests to 64 (consistent with the value of 
'max_session_slots') but the server always replies with a value of 10 for 
ca_maxreqests.  This seems to be the source of the performance issue, 
since if I fallback to v4.0 or v3, but set nfsd to use only 10 threads in 
nfs.conf, I get roughly equivalent performance to v4.2.

 	Looking at the code (both in CentOS's 3.10.0-957.5.1.el7.x86_64 
and in the 4.20.8 mainline), it seems the value that would need to change 
is the preprocessor define NFSD_CACHE_SIZE_SLOTS_PER_SESSION.  This is 
fixed at 32, and while it's a bit more complex than this, the code in 
nfs4_get_drc_mem (fs/nfsd/nfs4state.c) basically sets the per-client 
session slot limit to '(int)(32/3)' which is where the '10' comes from.

 	This brings me back to the patch series mentioned above as this 
one from it:

https://patchwork.kernel.org/patch/1819971/

seems to allow the per-session limit to dynamically increase at least all 
the way to 32. (instead of being fixed at a max of 10)

 	Is there something else I've missed somewhere that allows 
adjusting the server-side session slot limit to be more than 10 without 
having to compile a custom version of nfsd.ko?

 	Thanks,

 	Chris

---------------------------------
Chris Tracy
System/Network Administrator
School of Engineering
Santa Clara University
"Wherever you go, there you are."

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Linux NFS v4.1 server support for dynamic slot allocation?
  2019-02-18 21:46 Linux NFS v4.1 server support for dynamic slot allocation? Chris Tracy
@ 2019-02-20 17:10 ` J. Bruce Fields
  2019-02-20 19:07   ` Chris Tracy
  0 siblings, 1 reply; 5+ messages in thread
From: J. Bruce Fields @ 2019-02-20 17:10 UTC (permalink / raw)
  To: Chris Tracy; +Cc: linux-nfs

On Mon, Feb 18, 2019 at 01:46:24PM -0800, Chris Tracy wrote:
> 	Hopefully I'm not missing something obvious, but I'm curious
> whatever happened to the patch series from late 2012 that added
> dynamic v4.1 session slot allocation support to nfsd:
> 
> https://www.spinics.net/lists/linux-nfs/msg34390.html
> 
> The corresponding nfs client patches were integrated, but the nfsd
> series seems to have been left out due to release timing:
> 
> https://www.spinics.net/lists/linux-nfs/msg34505.html
> 
> However, they don't seem to ever have been integrated or discussed
> again. Were there other issues that prevented its inclusion in the
> intervening time?

They'd probably need reworking.  The latest discussion I can find is:

	https://lore.kernel.org/linux-nfs/CAABAsM6vDOaudUZYWH23oGiWGqX5Bd1YbCDnL6L=pxzMXgZzaw@mail.gmail.com/

> 	Alternatively, is there some admin-tweakable knob for controlling
> the number of slots available per-session on the NFS v4.1 server
> (nfsd.ko), similar to the 'max_session_slots' client-side parameter
> for nfs.ko?
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/fs/nfs?id=ef159e9177cc5a09e6174796dde0b2d243ddf28b
> 
> 	I ask because I'm currently standing up a (very) modest HPC cluster
> PoC (1 server, 8 client nodes, all 10Gbit, all running CentOS 7.6)
> and figured that was a good enough excuse to finally move away from
> NFS v3 and investigate NFS v4.x.  However, initial performance
> testing showed that while NFS v4.0 was essentially identical to v3,
> NFS v4.2 (and v4.1) were around 25% slower.
> 
> 	Looking at the traffic in wireshark, I see that in CREATE_SESSION,
> the client sets ca_maxrequests to 64 (consistent with the value of
> 'max_session_slots') but the server always replies with a value of
> 10 for ca_maxreqests.  This seems to be the source of the
> performance issue, since if I fallback to v4.0 or v3, but set nfsd
> to use only 10 threads in nfs.conf, I get roughly equivalent
> performance to v4.2.
> 
> 	Looking at the code (both in CentOS's 3.10.0-957.5.1.el7.x86_64 and
> in the 4.20.8 mainline), it seems the value that would need to
> change is the preprocessor define NFSD_CACHE_SIZE_SLOTS_PER_SESSION.
> This is fixed at 32, and while it's a bit more complex than this,
> the code in nfs4_get_drc_mem (fs/nfsd/nfs4state.c) basically sets
> the per-client session slot limit to '(int)(32/3)' which is where
> the '10' comes from.

Thanks for the report!

I think the limit should only be that low if the client requests very
large slots.  Do your clients have 35c036ef4a72 "nfs: RPC_MAX_AUTH_SIZE
is in bytes" applied?

> 	Is there something else I've missed somewhere that allows adjusting
> the server-side session slot limit to be more than 10 without having
> to compile a custom version of nfsd.ko?

No.  It might be a good idea, though really I think your setup should
just work out of the box.

--b.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Linux NFS v4.1 server support for dynamic slot allocation?
  2019-02-20 17:10 ` J. Bruce Fields
@ 2019-02-20 19:07   ` Chris Tracy
  2019-02-21 16:27     ` J. Bruce Fields
  0 siblings, 1 reply; 5+ messages in thread
From: Chris Tracy @ 2019-02-20 19:07 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs

Bruce,

> They'd probably need reworking.  The latest discussion I can find is:
>
> 	https://lore.kernel.org/linux-nfs/CAABAsM6vDOaudUZYWH23oGiWGqX5Bd1YbCDnL6L=pxzMXgZzaw@mail.gmail.com/

 	Ahh, thanks.  If server-side dynamic slot allocation does get 
added at some point, I'll certainly be interested to test.

>> 	Looking at the code (both in CentOS's 3.10.0-957.5.1.el7.x86_64 and
>> in the 4.20.8 mainline), it seems the value that would need to
>> change is the preprocessor define NFSD_CACHE_SIZE_SLOTS_PER_SESSION.
>> This is fixed at 32, and while it's a bit more complex than this,
>> the code in nfs4_get_drc_mem (fs/nfsd/nfs4state.c) basically sets
>> the per-client session slot limit to '(int)(32/3)' which is where
>> the '10' comes from.
>
> Thanks for the report!
>
> I think the limit should only be that low if the client requests very
> large slots.  Do your clients have 35c036ef4a72 "nfs: RPC_MAX_AUTH_SIZE
> is in bytes" applied?

 	They do, yes.

 	I'm nothing close to a kernel hacker, but the issue seems to come 
down to nfsd4_get_drc_mem().  Yes, it calls slot_bytes() which uses 
ca->maxresp_cached (the size of which is impacted by the referenced patch) 
but the slot size's impact on the number of slots returned seems to pale 
in comparison to this bit in nfsd4_get_drc_mem():

-----------
avail = min((unsigned long)NFSD_MAX_MEM_PER_SESSION,
 		nfsd_drc_max_mem - nfsd_drc_mem_used);
/*
  * Never use more than a third of the remaining memory,
  * unless it's the only way to give this client a slot:
  */
avail = clamp_t(int, avail, slotsize, avail/3);
num = min_t(int, num, avail / slotsize);
-----------

That first min() call seems almost always guaranteed to use 
NFSD_MAX_MEM_PER_SESSION, at least in my scale of testing where the number 
of clients and connections is relatively low.  Since this is defined as:

-----------
/* Maximum  session per slot cache size */
#define NFSD_SLOT_CACHE_SIZE		2048
/* Maximum number of NFSD_SLOT_CACHE_SIZE slots per session */
#define NFSD_CACHE_SIZE_SLOTS_PER_SESSION	32
#define NFSD_MAX_MEM_PER_SESSION  \
 	(NFSD_CACHE_SIZE_SLOTS_PER_SESSION * NFSD_SLOT_CACHE_SIZE)
-----------

NFSD_MAX_MEM_PER_SESSION is 65,536 bytes, and thus that's as big as 
'avail' can ever be.  'slotsize' is the return value of 'slot_bytes(ca)' 
which uses 'ca->maxresp_cached' as referenced above, and at least here 
ends up returning a value of 2128.  So the code then clamp's 'avail' to 
between 2128 and 21845 (65536/3) and then goes on to set 'num' to the 
minimum of the client request (64 in this case) or 10 (21845/2128).

 	Unfortunately, I don't understand the code well enough to suggest 
an alternative approach.  However, it does seem to me that it can 
currently only ever return a maximum of 10 slots, which seems low, 
especially in the low-client, high bandwidth (10G or more) case that I'm 
dealing with.

>> 	Is there something else I've missed somewhere that allows adjusting
>> the server-side session slot limit to be more than 10 without having
>> to compile a custom version of nfsd.ko?
>
> No.  It might be a good idea, though really I think your setup should
> just work out of the box.

 	Out of the box would be great, but I'd be happy with a manual 
knob.  I'm just looking for some way to control the per-client slot count 
on the server side.  (Something as conceptually simple as increasing 
NFSD_CACHE_SIZE_SLOTS_PER_SESSION to 64, removing the /3, and then 
exposing an nfsd 'max_slots_per_session' parameter, capped at 64, would 
work for me, I think)

 	And in fairness, it's not like it's broken out of the box.  I'm 
complaining about single-client read speeds being 600MB/s with NFS v3/v4.0 
but "only" ~440MB/s with NFS v4.1/v4.2.  It would be nice to eventually 
have the same level of performance available, but it's certainly usable.

 	Thanks,

 	Chris

NOTE: Looking at it now, I wonder if the intent of the comment block in 
nfsd4_get_drc_mem() would be better expressed:

--------
avail = min((unsigned long)NFSD_MAX_MEM_PER_SESSION,
 		nfsd_drc_max_mem - nfsd_drc_mem_used);
/*
  * Never use more than a third of the remaining memory,
  * unless it's the only way to give this client a slot:
  */
avail = clamp_t(int, avail, slotsize,
 	(nfsd_drc_max_mem - nfsd_drc_mem_used)/3);
num = min_t(int, num, avail / slotsize);
--------

ensuring that 'avail' can never be more than a third of the DRC memory 
available, rather than a third of NFSD_MAX_MEM_PER_SESSION.  That would at 
least allow each client to use up to 32 slots, which would be a 
significant improvement.  (Though some sort of manual knob or auto-tuning 
would be nice)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Linux NFS v4.1 server support for dynamic slot allocation?
  2019-02-20 19:07   ` Chris Tracy
@ 2019-02-21 16:27     ` J. Bruce Fields
  2019-02-27 20:48       ` Chris Tracy
  0 siblings, 1 reply; 5+ messages in thread
From: J. Bruce Fields @ 2019-02-21 16:27 UTC (permalink / raw)
  To: Chris Tracy; +Cc: linux-nfs

On Wed, Feb 20, 2019 at 11:07:45AM -0800, Chris Tracy wrote:
> 	I'm nothing close to a kernel hacker, but the issue seems to come
> down to nfsd4_get_drc_mem().  Yes, it calls slot_bytes() which uses
> ca->maxresp_cached (the size of which is impacted by the referenced
> patch) but the slot size's impact on the number of slots returned
> seems to pale in comparison to this bit in nfsd4_get_drc_mem():

Gah, yes, that's just a bug.  Thanks for your persistence.

We need some (optional) pynfs CREATE_SESSION tests to confirm that this
is working the way we expect, rather than waiting for people to report
performance regressions....

> 	And in fairness, it's not like it's broken out of the box.  I'm
> complaining about single-client read speeds being 600MB/s with NFS
> v3/v4.0 but "only" ~440MB/s with NFS v4.1/v4.2.

Out of curiosity, is your bandwidth limited by disk at this point?  If
not it might be interesting to compare with recent upstream.

> NOTE: Looking at it now, I wonder if the intent of the comment block
> in nfsd4_get_drc_mem() would be better expressed:
> 
> --------
> avail = min((unsigned long)NFSD_MAX_MEM_PER_SESSION,
> 		nfsd_drc_max_mem - nfsd_drc_mem_used);
> /*
>  * Never use more than a third of the remaining memory,
>  * unless it's the only way to give this client a slot:
>  */
> avail = clamp_t(int, avail, slotsize,
> 	(nfsd_drc_max_mem - nfsd_drc_mem_used)/3);
> num = min_t(int, num, avail / slotsize);
> --------
> 
> ensuring that 'avail' can never be more than a third of the DRC
> memory available, rather than a third of NFSD_MAX_MEM_PER_SESSION.

Yep.  Here's what I've got:

commit c54f24e338ed
Author: J. Bruce Fields <bfields@redhat.com>
Date:   Thu Feb 21 10:47:00 2019 -0500

    nfsd: fix performance-limiting session calculation
    
    We're unintentionally limiting the number of slots per nfsv4.1 session
    to 10.  Often more than 10 simultaneous RPCs are needed for the best
    performance.
    
    This calculation was meant to prevent any one client from using up more
    than a third of the limit we set for total memory use across all clients
    and sessions.  Instead, it's limiting the client to a third of the
    maximum for a single session.
    
    Fix this.
    
    Reported-by: Chris Tracy <ctracy@engr.scu.edu>
    Cc: stable@vger.kernel.org
    Fixes: de766e570413 "nfsd: give out fewer session slots as limit approaches"
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index fb3c9844c82a..6a45fb00c5fc 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1544,16 +1544,16 @@ static u32 nfsd4_get_drc_mem(struct nfsd4_channel_attrs *ca)
 {
 	u32 slotsize = slot_bytes(ca);
 	u32 num = ca->maxreqs;
-	int avail;
+	unsigned long avail, total_avail;
 
 	spin_lock(&nfsd_drc_lock);
-	avail = min((unsigned long)NFSD_MAX_MEM_PER_SESSION,
-		    nfsd_drc_max_mem - nfsd_drc_mem_used);
+	total_avail = nfsd_drc_max_mem - nfsd_drc_mem_used;
+	avail = min((unsigned long)NFSD_MAX_MEM_PER_SESSION, total_avail);
 	/*
 	 * Never use more than a third of the remaining memory,
 	 * unless it's the only way to give this client a slot:
 	 */
-	avail = clamp_t(int, avail, slotsize, avail/3);
+	avail = clamp_t(int, avail, slotsize, total_avail/3);
 	num = min_t(int, num, avail / slotsize);
 	nfsd_drc_mem_used += num * slotsize;
 	spin_unlock(&nfsd_drc_lock);

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: Linux NFS v4.1 server support for dynamic slot allocation?
  2019-02-21 16:27     ` J. Bruce Fields
@ 2019-02-27 20:48       ` Chris Tracy
  0 siblings, 0 replies; 5+ messages in thread
From: Chris Tracy @ 2019-02-27 20:48 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs

Bruce,

 	Sorry for the delayed reply, only just now got some time to come 
back to this and test.

>> 	And in fairness, it's not like it's broken out of the box.  I'm
>> complaining about single-client read speeds being 600MB/s with NFS
>> v3/v4.0 but "only" ~440MB/s with NFS v4.1/v4.2.
>
> Out of curiosity, is your bandwidth limited by disk at this point?  If
> not it might be interesting to compare with recent upstream.

 	Yeah, the 600MB/s is disk limited in this case.

> Yep.  Here's what I've got:

 	Looks good.  Built a custom kernel with that patch and tested it 
on the NFS server.  As before, the client set 'ca_maxrequests' to 64, but 
now the server replies with 31 (instead of 10) so the session is 
established with 31[*] slots.  NFS v4.1/4.2 performance is now more or 
less equivalent to that of NFS v3/4.0 in my tests.

 	Hope this patch lands in a future RHEL 7 kernel.  :-)

 	Thanks for your time and help,

 	Chris

[*] 31 instead of 32 (NFSD_CACHE_SIZE_SLOTS_PER_SESSION) because
     'slotsize' (from 'slot_bytes()') ends up being slightly larger than
     2048. (NFSD_SLOT_CACHE_SIZE)

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-02-27 20:48 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-18 21:46 Linux NFS v4.1 server support for dynamic slot allocation? Chris Tracy
2019-02-20 17:10 ` J. Bruce Fields
2019-02-20 19:07   ` Chris Tracy
2019-02-21 16:27     ` J. Bruce Fields
2019-02-27 20:48       ` Chris Tracy

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.