* [PATCH] make nfsd_drc_max_mem configurable
@ 2015-06-17 12:48 Christoph Martin
  2015-06-18 16:16 ` J. Bruce Fields
  0 siblings, 1 reply; 4+ messages in thread
From: Christoph Martin @ 2015-06-17 12:48 UTC (permalink / raw)
  To: Andy Adamson, J. Bruce Fields; +Cc: Markus Tacke, linux-nfs


Dear Andy, dear Bruce,

(Sorry for the resend to you; I have now cc'd linux-nfs.)

I have attached a patch (nfssvc.diff) that makes the size of the nfsd
DRC cache configurable.

In the last month we stumbled twice over the problem that the
NFS4.1 session cache was too small.

The first time, we wanted to set up an NFS server for our HPC cluster. We
wondered why we could mount the filesystem on only 380 of
our ~700 nodes. It took us a long time to find out that the cause was the
limit of the NFS4.1 session cache. Since this machine had 12G of RAM, the
kernel reserved 12M for the cache, which yields 384 slots of 32k each:

echo $(((12582912>>10)/32))
384

We patched the Red Hat 7 kernel to change NFSD_DRC_SIZE_SHIFT
from 10 to 7 to fix this problem.

The second time, we installed a small Debian VM with 1G of RAM to act as
an NFS4 referral server for the home and group directories on our campus.
Since the server only handles NFS referrals, it does not really need more
memory than the 1G. But it could only serve about 30 clients because of
this limit on the session cache.
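
To make the arithmetic in both cases explicit, here is a minimal userspace
sketch. It is not kernel code: the function name is made up for
illustration, total RAM is used in place of nr_free_buffer_pages() (so real
numbers come out somewhat lower), and 32k per session is the value observed
above rather than a constant taken from the source.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Estimate how many sessions fit into the DRC budget.
 *
 * mem_bytes:         memory the budget is derived from (assumed: total RAM)
 * shift:             value of NFSD_DRC_SIZE_SHIFT (10 in the stock kernel)
 * per_session_bytes: assumed reservation per session (32k as observed)
 */
uint64_t drc_session_estimate(uint64_t mem_bytes, unsigned int shift,
			      uint64_t per_session_bytes)
{
	assert(per_session_bytes > 0);
	assert(shift < 64);

	/* budget = mem >> shift; each session takes one fixed-size share */
	return (mem_bytes >> shift) / per_session_bytes;
}
```

Under these assumptions, 12G with a shift of 10 gives 384 sessions, the same
12G with a shift of 7 gives 3072, and 1G with a shift of 10 gives 32, which
is in line with the roughly 30 clients seen on the VM.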

I think it would be a good idea to make the amount of memory
configurable in nfsd. So I wrote this small patch to make drc_size
configurable when the nfsd kernel module is loaded.

The patch uses the old value computed from NFSD_DRC_SIZE_SHIFT as the
lower limit. If drc_size, given as a parameter to nfsd, is higher than
that default (about 1/1000 of the RAM), the parameter value is used.
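
The selection rule can be tried outside the kernel with a small sketch like
the following. The function and macro names are hypothetical;
free_buffer_pages and page_size are passed in as stand-ins for
nr_free_buffer_pages() and PAGE_SIZE.

```c
#include <assert.h>

/* Mirrors NFSD_DRC_SIZE_SHIFT; a shift of 10 means 1/1024 of the pages. */
#define DRC_SIZE_SHIFT 10

/*
 * Userspace sketch of the choice made in set_max_drc() with the patch:
 * the computed default is the lower limit, and a larger drc_size wins.
 * All sizes are in bytes except free_buffer_pages.
 */
unsigned long drc_pick_max_mem(unsigned long drc_size_param,
			       unsigned long free_buffer_pages,
			       unsigned long page_size)
{
	unsigned long dflt;

	assert(page_size > 0);
	dflt = (free_buffer_pages >> DRC_SIZE_SHIFT) * page_size;

	/* same effect as max(drc_size, default) in the patch */
	return drc_size_param > dflt ? drc_size_param : dflt;
}
```

With 3145728 free pages of 4096 bytes (12G), an unset parameter (0) yields
the old 12582912 bytes, and a value below that default is ignored. Since
drc_size is declared with module_param(), it would be given at module load
time, for example as drc_size=100663296 for 96M.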

One might consider making NFSD_DRC_SIZE_SHIFT even higher, to use less
memory in situations where it is not needed. I did not implement an
upper limit, but one might be important.

Please consider including this patch in the nfsd code.

Yours
Christoph Martin

--- linux-source-3.16/fs/nfsd/nfssvc.c	2015-03-30 12:09:09.000000000 +0200
+++ linux-source-3.16.nfsd/fs/nfsd/nfssvc.c	2015-06-17 09:28:37.880443867 +0200
@@ -359,11 +359,19 @@ void nfsd_reset_versions(void)
  * For now this is a #defined shift which could be under admin control
  * in the future.
  */
+
+static ulong drc_size = 0;
+module_param(drc_size, ulong, 0444);
+MODULE_PARM_DESC(drc_size,
+		 "size of NFSv4.1 DRC cache memory (default and minimum: free_buffer_size >> 10)");
+
 static void set_max_drc(void)
 {
 	#define NFSD_DRC_SIZE_SHIFT	10
-	nfsd_drc_max_mem = (nr_free_buffer_pages()
-					>> NFSD_DRC_SIZE_SHIFT) * PAGE_SIZE;
+	nfsd_drc_max_mem = max(drc_size,
+			       (nr_free_buffer_pages()
+				>> NFSD_DRC_SIZE_SHIFT) * PAGE_SIZE);
+	drc_size = nfsd_drc_max_mem;
 	nfsd_drc_mem_used = 0;
 	spin_lock_init(&nfsd_drc_lock);
 	dprintk("%s nfsd_drc_max_mem %lu \n", __func__, nfsd_drc_max_mem);




* Re: [PATCH] make nfsd_drc_max_mem configurable
  2015-06-17 12:48 [PATCH] make nfsd_drc_max_mem configurable Christoph Martin
@ 2015-06-18 16:16 ` J. Bruce Fields
  2015-07-06 12:59   ` Christoph Martin
  0 siblings, 1 reply; 4+ messages in thread
From: J. Bruce Fields @ 2015-06-18 16:16 UTC (permalink / raw)
  To: Christoph Martin; +Cc: Andy Adamson, Markus Tacke, linux-nfs

On Wed, Jun 17, 2015 at 02:48:04PM +0200, Christoph Martin wrote:
> Dear Andy, dear Bruce,
> 
> (sorry for the recent to you, I now cc'd linux-nfs)
> 
> I have attached a patch to nfssvc.diff to make the size of the drc nfsd
> cache configurable.
> 
> In the last month we were stumbling twice over the problem that the
> NFS4.1 session cache was to small.

Thanks for the careful investigation!

> The first time we wanted to setup a NFS Server for our HPC cluster. We
> were wondering why we were only able to mount the filesystem on 380 of
> our ~700 nodes. It took us a long time to find out that it was the limit
> of the NFS4.1 session cache. Since this machine had 12G Ram, the kernel
> reserved 12M for the cache, which results in 384 slots a 32k:
> 
> echo $(((12582912>>10)/32))
> 384

So each client is using 32k?

Might be interesting to take a look at the CREATE_SESSION call and reply
in wireshark (especially the values of maxresponsesize_cached and
maxrequests)--there might also be defaults there that need tweaking.

> We patched the kernel redhat 7 kernel to change NFSD_DRC_SIZE_SHIFT to
> from 10 to 7 to fix this problem.
> 
> The second time we installed a small Debian VM with 1G ram to act as a
> NFS4 referral server for the home and group directories on our campus.
> Since the server does only NFS referrals it does not really need more
> memory than the 1G. But it could only server about 30 clients with this
> limitation of the session cache.
> 
> I think it would be a good idea to have the amount of memory
> configurable in nfsd. So I wrote this small patch to make drc_size
> configurable while loading the kernel nfsd module.
> 
> The patch uses the old value computed from NFSD_DRC_SIZE_SHIFT as the
> lower limit. If drc_size as a parameter for then nfsd is higher than a
> 1/1000 of the RAM, this value will be used.
> 
> One might consider to make NFSD_DRC_SIZE_SHIFT even higher to use less
> memory for situations where it is not needed. I did not implement an
> upper limit, but it might be important.
> 
> Please consider to include this patch into the nfsd code.

Looks good, my one concern is that this covers only the size of the 4.1
session cache.  We may need to add some more limits in the future and
might not want to require separate configuration of each limit.

Maybe one or two more generic size parameters would be more useful?
Like:

	- Maximum memory to devote to knfsd
	- Maximum memory to devote to a single client

--b.





* Re: [PATCH] make nfsd_drc_max_mem configurable
  2015-06-18 16:16 ` J. Bruce Fields
@ 2015-07-06 12:59   ` Christoph Martin
  2015-08-20  9:30     ` Christoph Martin
  0 siblings, 1 reply; 4+ messages in thread
From: Christoph Martin @ 2015-07-06 12:59 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Andy Adamson, Markus Tacke, linux-nfs



Hi Bruce,

On 18.06.2015 at 18:16, J. Bruce Fields wrote:
> 
>> The first time we wanted to setup a NFS Server for our HPC cluster. We
>> were wondering why we were only able to mount the filesystem on 380 of
>> our ~700 nodes. It took us a long time to find out that it was the limit
>> of the NFS4.1 session cache. Since this machine had 12G Ram, the kernel
>> reserved 12M for the cache, which results in 384 slots a 32k:
>>
>> echo $(((12582912>>10)/32))
>> 384
> 
> So each client is using 32k?

#define NFSD_SLOT_CACHE_SIZE		2048
/* Maximum number of NFSD_SLOT_CACHE_SIZE slots per session */
#define NFSD_CACHE_SIZE_SLOTS_PER_SESSION	32
#define NFSD_MAX_MEM_PER_SESSION  \
		(NFSD_CACHE_SIZE_SLOTS_PER_SESSION * NFSD_SLOT_CACHE_SIZE)

So this would be 64k. Maybe I missed a factor of 2 somewhere. But the
calculation above matches what we observed in our tests.
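
For reference, the two per-session figures can be compared with a short
sketch. The defines are copied from the excerpt above; the helper function
is hypothetical and only does the division.

```c
#include <assert.h>

/* Values copied from the defines quoted in this message. */
#define NFSD_SLOT_CACHE_SIZE			2048
#define NFSD_CACHE_SIZE_SLOTS_PER_SESSION	32
#define NFSD_MAX_MEM_PER_SESSION \
	(NFSD_CACHE_SIZE_SLOTS_PER_SESSION * NFSD_SLOT_CACHE_SIZE)

/* Sessions that fit into budget_bytes if each reserves per_session bytes. */
unsigned long sessions_for_budget(unsigned long budget_bytes,
				  unsigned long per_session)
{
	assert(per_session > 0);
	return budget_bytes / per_session;
}
```

With the 12M budget (12582912 bytes), the 64k maximum would allow only 192
sessions, while 32k per session gives the 384 that matches the tests. One
possible reading is that each session reserved less than the maximum, but
that is an assumption, not something confirmed here.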

> 
> Might be interesting to take a look at the CREATE_SESSION call and reply
> in wireshark (especially the values of maxresponsesize_cached and
> maxrequests)--there might also be defaults there that need tweaking.
> 
>> We patched the kernel redhat 7 kernel to change NFSD_DRC_SIZE_SHIFT to
>> from 10 to 7 to fix this problem.
>>
>> The second time we installed a small Debian VM with 1G ram to act as a
>> NFS4 referral server for the home and group directories on our campus.
>> Since the server does only NFS referrals it does not really need more
>> memory than the 1G. But it could only server about 30 clients with this
>> limitation of the session cache.
>>
>> I think it would be a good idea to have the amount of memory
>> configurable in nfsd. So I wrote this small patch to make drc_size
>> configurable while loading the kernel nfsd module.
>>
>> The patch uses the old value computed from NFSD_DRC_SIZE_SHIFT as the
>> lower limit. If drc_size as a parameter for then nfsd is higher than a
>> 1/1000 of the RAM, this value will be used.
>>
>> One might consider to make NFSD_DRC_SIZE_SHIFT even higher to use less
>> memory for situations where it is not needed. I did not implement an
>> upper limit, but it might be important.
>>
>> Please consider to include this patch into the nfsd code.
> 
> Looks good, 

As far as I understand the code now, it would even be possible to change
the value of nfsd_drc_max_mem while nfsd is running, since the value is
only used in nfsd4_get_drc_mem in nfs4state.c. I don't see that the
limit is enforced on the slab itself. It seems to apply only to the local
accounting of the slab usage.
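
The kind of accounting meant here can be illustrated with a minimal sketch.
This is not the code of nfsd4_get_drc_mem; the struct and function names are
hypothetical, and locking is left out (the kernel protects its counters with
nfsd_drc_lock).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for nfsd_drc_max_mem and nfsd_drc_mem_used. */
struct drc_budget {
	unsigned long max_mem;	/* limit in bytes, may change at runtime */
	unsigned long mem_used;	/* bytes currently reserved */
};

/* Reserve 'want' bytes if they still fit under the limit. */
bool drc_budget_reserve(struct drc_budget *b, unsigned long want)
{
	assert(b != NULL);

	/* written to avoid overflow and to cope with a lowered max_mem */
	if (want > b->max_mem || b->mem_used > b->max_mem - want)
		return false;
	b->mem_used += want;
	return true;
}

/* Return previously reserved bytes to the budget. */
void drc_budget_release(struct drc_budget *b, unsigned long bytes)
{
	assert(b != NULL && bytes <= b->mem_used);
	b->mem_used -= bytes;
}
```

Because the limit is only consulted when a reservation is made, raising
max_mem later takes effect for the next reservation without touching what
has already been handed out.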

> my one concern is that this covers only the size of the 4.1
> session cache.  We may need to add some more limits in the future and
> might not want to require separate configuration of each limit.
> 
> Maybe one or two more generic size parameters would be more useful?
> Like:
> 
> 	- Maximum memory to devote to knfsd
> 	- Maximum memory to devote to a single client
> 

We discussed this and don't think it is a good idea.

If you have only one limit per knfsd or per client, you don't know how
much memory to assign to the different memory slabs, such as the drc or
others, because you can't know in advance whether the NFS server will be
used only for, say, nfs3, nfs4, or nfs4.1, or for a mixture.

So if you think a global limit is necessary, you also need
tunables for distributing the available memory among the different
protocols.

I found the following calls to kmem_cache_create:

> 
> nfs4state.c:2635:	openowner_slab = kmem_cache_create("nfsd4_openowners",
> nfs4state.c:2639:	lockowner_slab = kmem_cache_create("nfsd4_lockowners",
> nfs4state.c:2643:	file_slab = kmem_cache_create("nfsd4_files",
> nfs4state.c:2647:	stateid_slab = kmem_cache_create("nfsd4_stateids",
> nfs4state.c:2651:	deleg_slab = kmem_cache_create("nfsd4_delegations",
> nfscache.c:168:	drc_slab = kmem_cache_create("nfsd_drc", sizeof(struct svc_cacherep),

At first glance, they all use different methods to administer this
memory, if any at all.

Yours
Christoph


* Re: [PATCH] make nfsd_drc_max_mem configurable
  2015-07-06 12:59   ` Christoph Martin
@ 2015-08-20  9:30     ` Christoph Martin
  0 siblings, 0 replies; 4+ messages in thread
From: Christoph Martin @ 2015-08-20  9:30 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Andy Adamson, Markus Tacke, linux-nfs


Hi Bruce

> 
> I found the following calls to kmem_cache_create:
> 
>>
>> nfs4state.c:2635:	openowner_slab = kmem_cache_create("nfsd4_openowners",
>> nfs4state.c:2639:	lockowner_slab = kmem_cache_create("nfsd4_lockowners",
>> nfs4state.c:2643:	file_slab = kmem_cache_create("nfsd4_files",
>> nfs4state.c:2647:	stateid_slab = kmem_cache_create("nfsd4_stateids",
>> nfs4state.c:2651:	deleg_slab = kmem_cache_create("nfsd4_delegations",
>> nfscache.c:168:	drc_slab = kmem_cache_create("nfsd_drc", sizeof(struct svc_cacherep),
> 
> At a first look, they all use different methods to administer this
> memory if at all.
> 

Another solution would be to simply remove the upper limit for the drc
cache size.

What do you think? How do we solve this problem?

Christoph

