All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chuck Lever III <chuck.lever@oracle.com>
To: Dai Ngo <dai.ngo@oracle.com>
Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH 2/2] NFSD: limit the number of v4 clients to 4096 per 4GB of system memory
Date: Tue, 12 Jul 2022 21:54:12 +0000	[thread overview]
Message-ID: <D112EBD0-D062-498C-A15F-65A44097AC6B@oracle.com> (raw)
In-Reply-To: <1657656660-16647-3-git-send-email-dai.ngo@oracle.com>

Hello Dai, lovely to see this!


> On Jul 12, 2022, at 4:11 PM, Dai Ngo <dai.ngo@oracle.com> wrote:
> 
> Currently there is no limit on how many v4 clients are supported
> by the system. This can be a problem in systems with small memory
> configuration to function properly when a very large number of
> clients exist that creates memory shortage conditions.
> 
> This patch enforces a limit of 4096 NFSv4 clients, including courtesy
> clients, per 4GB of system memory.  When the number of the clients
> reaches the limit, requests that create new clients are returned
> with NFS4ERR_DELAY. The laundromat detects this condition and removes
> older courtesy clients. Due to the overhead of the upcall to remove
> the client record, the maximun number of clients the laundromat
> removes on each run is limited to 128. This is done to ensure the
> laundromat can still process other tasks in a timely manner.
> 
> Since there is now a limit of the number of clients, the 24-hr
> idle time limit of courtesy client is no longer needed and was
> removed.
> 
> Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
> ---
> fs/nfsd/netns.h     |  1 +
> fs/nfsd/nfs4state.c | 17 +++++++++++++----
> fs/nfsd/nfsctl.c    |  8 ++++++++
> 3 files changed, 22 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> index ce864f001a3e..8d72b302a49c 100644
> --- a/fs/nfsd/netns.h
> +++ b/fs/nfsd/netns.h
> @@ -191,6 +191,7 @@ struct nfsd_net {
> 	siphash_key_t		siphash_key;
> 
> 	atomic_t		nfs4_client_count;
> +	unsigned int		nfs4_max_clients;
> };
> 
> /* Simple check to find out if a given net was properly initialized */
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 30e16d9e8657..e54db346dc00 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -126,6 +126,7 @@ static const struct nfsd4_callback_ops nfsd4_cb_recall_ops;
> static const struct nfsd4_callback_ops nfsd4_cb_notify_lock_ops;
> 
> static struct workqueue_struct *laundry_wq;
> +#define	NFSD_CLIENT_MAX_TRIM_PER_RUN	128

Let's move these #defines to a header file instead of scattering
them in the source code. How about fs/nfsd/nfsd.h ?


> int nfsd4_create_laundry_wq(void)
> {
> @@ -2059,6 +2060,8 @@ static struct nfs4_client *alloc_client(struct xdr_netobj name,
> 	struct nfs4_client *clp;
> 	int i;
> 
> +	if (atomic_read(&nn->nfs4_client_count) >= nn->nfs4_max_clients)
> +		return NULL;

So, NFSD will return NFS4ERR_DELAY if it is asked to establish
a new client and we've hit this limit. The next laundromat run
should knock a few lingering COURTESY clients out of the LRU
to make room for a new client.

Maybe you want to kick the laundromat here to get that process
moving sooner?


> 	clp = kmem_cache_zalloc(client_slab, GFP_KERNEL);
> 	if (clp == NULL)
> 		return NULL;
> @@ -5796,9 +5799,12 @@ static void
> nfs4_get_client_reaplist(struct nfsd_net *nn, struct list_head *reaplist,
> 				struct laundry_time *lt)
> {
> +	unsigned int maxreap = 0, reapcnt = 0;
> 	struct list_head *pos, *next;
> 	struct nfs4_client *clp;
> 
> +	if (atomic_read(&nn->nfs4_client_count) >= nn->nfs4_max_clients)
> +		maxreap = NFSD_CLIENT_MAX_TRIM_PER_RUN;

The idea I guess is "don't reap anything until we exceed the
maximum number of clients". It took me a bit to figure that
out.


> 	INIT_LIST_HEAD(reaplist);
> 	spin_lock(&nn->client_lock);
> 	list_for_each_safe(pos, next, &nn->client_lru) {
> @@ -5809,14 +5815,17 @@ nfs4_get_client_reaplist(struct nfsd_net *nn, struct list_head *reaplist,
> 			break;
> 		if (!atomic_read(&clp->cl_rpc_users))
> 			clp->cl_state = NFSD4_COURTESY;
> -		if (!client_has_state(clp) ||
> -				ktime_get_boottime_seconds() >=
> -				(clp->cl_time + NFSD_COURTESY_CLIENT_TIMEOUT))
> +		if (!client_has_state(clp))
> 			goto exp_client;
> 		if (nfs4_anylock_blockers(clp)) {
> exp_client:
> -			if (!mark_client_expired_locked(clp))
> +			if (!mark_client_expired_locked(clp)) {
> 				list_add(&clp->cl_lru, reaplist);
> +				reapcnt++;
> +			}
> +		} else {
> +			if (reapcnt < maxreap)
> +				goto exp_client;
> 		}
> 	}

Would something like this be more straightforward? I probably
didn't get the logic exactly right.

		if (!nfs4_anylock_blockers(clp))
			if (reapcnt > maxreap)
				continue;
exp_client:
		if (!mark_client_expired_locked(clp)) {
			list_add(&clp->cl_lru, reaplist);
			reapcnt++;
		}
	}

The idea is: once maxreap has been reached, continue walking the
LRU looking for clients to convert from ACTIVE to COURTESY, but
do not reap any more COURTESY clients that might be found.


> 	spin_unlock(&nn->client_lock);
> diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> index 547f4c4b9668..223659e15af3 100644
> --- a/fs/nfsd/nfsctl.c
> +++ b/fs/nfsd/nfsctl.c
> @@ -96,6 +96,8 @@ static ssize_t (*const write_op[])(struct file *, char *, size_t) = {
> #endif
> };
> 
> +#define	NFS4_MAX_CLIENTS_PER_4GB	4096

No need for "MAX" in this name.

And, ditto the above comment: move this to a header file.


> +
> static ssize_t nfsctl_transaction_write(struct file *file, const char __user *buf, size_t size, loff_t *pos)
> {
> 	ino_t ino =  file_inode(file)->i_ino;
> @@ -1462,6 +1464,8 @@ unsigned int nfsd_net_id;
> static __net_init int nfsd_init_net(struct net *net)
> {
> 	int retval;
> +	unsigned long lowmem;
> +	struct sysinfo si;
> 	struct nfsd_net *nn = net_generic(net, nfsd_net_id);

Nit: I prefer the reverse christmas tree style. Can you add
the new stack variables after "struct nfsd_net *nn ..." ?


> 
> 	retval = nfsd_export_init(net);
> @@ -1488,6 +1492,10 @@ static __net_init int nfsd_init_net(struct net *net)
> 	seqlock_init(&nn->writeverf_lock);
> 
> 	atomic_set(&nn->nfs4_client_count, 0);
> +	si_meminfo(&si);
> +	lowmem = (si.totalram - si.totalhigh) * si.mem_unit;

There's no reason to restrict this to lowmem, since we're not
using a struct nfs4_client as the target of I/O.


> +	nn->nfs4_max_clients = (((lowmem * 100) >> 32) *
> +				NFS4_MAX_CLIENTS_PER_4GB) / 100;

On a platform where "unsigned long" is a 32-bit type, will
the shift-right-by-32 continue to work as you expect?

Let's try to simplify this computation, because it isn't
especially clear what is going on. The math might work a
little better if it were "1024 clients per GB" for example.


> 	return 0;
> 
> -- 
> 2.9.5
> 

--
Chuck Lever




  reply	other threads:[~2022-07-12 21:54 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-12 20:10 [PATCH 0/2] limit the number of v4 clients to 4096 per 4GB of system memory Dai Ngo
2022-07-12 20:10 ` [PATCH 1/2] NFSD: keep track of the number of v4 clients in the system Dai Ngo
2022-07-12 20:11 ` [PATCH 2/2] NFSD: limit the number of v4 clients to 4096 per 4GB of system memory Dai Ngo
2022-07-12 21:54   ` Chuck Lever III [this message]
2022-07-12 22:30     ` dai.ngo
2022-07-13 17:31       ` Chuck Lever III
2022-07-15  1:14   ` kernel test robot
  -- strict thread matches above, loose matches on Subject: below --
2022-07-12 20:09 Dai Ngo
2022-07-12 20:09 ` [PATCH 2/2] NFSD: limit the number of v4 clients to 4096 per 4GB of system memory Dai Ngo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=D112EBD0-D062-498C-A15F-65A44097AC6B@oracle.com \
    --to=chuck.lever@oracle.com \
    --cc=dai.ngo@oracle.com \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.