linux-kernel.vger.kernel.org archive mirror
* [PATCH] nfsd: more robust allocation failure handling in nfsd_reply_cache_init
@ 2016-08-30 11:48 Jeff Layton
  2016-08-30 18:23 ` Linus Torvalds
  0 siblings, 1 reply; 4+ messages in thread
From: Jeff Layton @ 2016-08-30 11:48 UTC
  To: bfields; +Cc: linux-nfs, linux-kernel, Olaf Hering, Linus Torvalds

Currently, we try to allocate the cache as a single, large chunk, which
can fail if no big chunks of memory are available. We _do_ try to size
it according to the amount of memory in the box, but if the server is
started well after boot time, then the allocation can fail due to memory
fragmentation.

Try to handle this more gracefully by cutting max_drc_entries in half
and retrying if the allocation fails. Give up only once the shrunken
request would be smaller than a page.

Reported-by: Olaf Hering <olaf@aepfle.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
---
 fs/nfsd/nfscache.c | 27 +++++++++++++++++++++++----
 1 file changed, 23 insertions(+), 4 deletions(-)

While this would be good to get in, I don't see any particular urgency
here. This seems like it'd be reasonable for v4.9.

diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c
index 54cde9a5864e..b8aaa7a71412 100644
--- a/fs/nfsd/nfscache.c
+++ b/fs/nfsd/nfscache.c
@@ -155,14 +155,12 @@ nfsd_reply_cache_free(struct nfsd_drc_bucket *b, struct svc_cacherep *rp)
 
 int nfsd_reply_cache_init(void)
 {
-	unsigned int hashsize;
+	unsigned int hashsize, target_hashsize;
 	unsigned int i;
 	int status = 0;
 
 	max_drc_entries = nfsd_cache_size_limit();
 	atomic_set(&num_drc_entries, 0);
-	hashsize = nfsd_hashsize(max_drc_entries);
-	maskbits = ilog2(hashsize);
 
 	status = register_shrinker(&nfsd_reply_cache_shrinker);
 	if (status)
@@ -173,9 +171,30 @@ int nfsd_reply_cache_init(void)
 	if (!drc_slab)
 		goto out_nomem;
 
-	drc_hashtbl = kcalloc(hashsize, sizeof(*drc_hashtbl), GFP_KERNEL);
+	/*
+	 * Attempt to allocate the hashtable, and progressively shrink the
+	 * size as the allocations fail. If the allocation size ends up being
+	 * smaller than a page however, then just give up.
+	 */
+	target_hashsize = nfsd_hashsize(max_drc_entries);
+	hashsize = target_hashsize;
+	do {
+		maskbits = ilog2(hashsize);
+		drc_hashtbl = kcalloc(hashsize, sizeof(*drc_hashtbl),
+					GFP_KERNEL|__GFP_NOWARN);
+		if (drc_hashtbl)
+			break;
+		max_drc_entries /= 2;
+		hashsize = nfsd_hashsize(max_drc_entries);
+	} while ((hashsize * sizeof(*drc_hashtbl)) >= PAGE_SIZE);
+
 	if (!drc_hashtbl)
 		goto out_nomem;
+
+	if (hashsize != target_hashsize)
+		pr_warn("NFSD: had to shrink reply cache hashtable (wanted %u, got %u)\n",
+			target_hashsize, hashsize);
+
 	for (i = 0; i < hashsize; i++) {
 		INIT_LIST_HEAD(&drc_hashtbl[i].lru_head);
 		spin_lock_init(&drc_hashtbl[i].cache_lock);
-- 
2.7.4
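
For anyone who wants to experiment with the retry logic outside the
kernel, the following is a minimal userspace sketch of the halving loop
in the patch above. It is an illustration under stated assumptions, not
the kernel code: model_calloc(), model_hashsize(), model_alloc_limit,
struct model_bucket and the 4 KiB MODEL_PAGE_SIZE are hypothetical
stand-ins. In particular, the real nfsd_hashsize() is not shown in this
thread, so the sizing rule used here is only an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for struct nfsd_drc_bucket; only its size matters. */
struct model_bucket {
	unsigned char pad[32];
};

/* Largest request (in bytes) the simulated allocator will satisfy. */
static size_t model_alloc_limit = (size_t)-1;

/* Simulated kcalloc(): fails when n * size exceeds model_alloc_limit. */
static void *model_calloc(size_t n, size_t size)
{
	if (size != 0 && n > model_alloc_limit / size)
		return NULL;
	return calloc(n, size);
}

/* Assumed sizing rule: one bucket per 64 entries, rounded up to a power of two. */
static unsigned int model_hashsize(unsigned int max_entries)
{
	unsigned int want = max_entries / 64;
	unsigned int size = 1;

	while (size < want)
		size <<= 1;
	return size;
}

/*
 * Try the full-size table first. On failure, halve *max_entries and retry,
 * stopping once the next request would be smaller than one page.
 * Returns the table (release with free()) or NULL. On success *hashsize_out
 * holds the bucket count that was actually allocated.
 */
static struct model_bucket *model_alloc_table(unsigned int *max_entries,
					      unsigned int *hashsize_out)
{
	struct model_bucket *tbl;
	unsigned int hashsize;

	assert(max_entries && hashsize_out);
	hashsize = model_hashsize(*max_entries);
	do {
		tbl = model_calloc(hashsize, sizeof(*tbl));
		if (tbl) {
			*hashsize_out = hashsize;
			return tbl;
		}
		*max_entries /= 2;
		hashsize = model_hashsize(*max_entries);
	} while (hashsize * sizeof(*tbl) >= MODEL_PAGE_SIZE);

	return NULL;
}
```

With model_alloc_limit set to 256 * sizeof(struct model_bucket) and a
starting limit of 65536 entries, this sketch fails at 1024 and 512
buckets and succeeds at 256, leaving max_entries at 16384. As in the
patch, the entry limit shrinks together with the table, so the cache
never holds more entries than the smaller table was sized for.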

* Re: [PATCH] nfsd: more robust allocation failure handling in nfsd_reply_cache_init
  2016-08-30 11:48 [PATCH] nfsd: more robust allocation failure handling in nfsd_reply_cache_init Jeff Layton
@ 2016-08-30 18:23 ` Linus Torvalds
  2016-10-20 15:52   ` Bruce Fields
  0 siblings, 1 reply; 4+ messages in thread
From: Linus Torvalds @ 2016-08-30 18:23 UTC
  To: Jeff Layton
  Cc: Bruce Fields, Linux NFS Mailing List, Linux Kernel Mailing List,
	Olaf Hering

On Tue, Aug 30, 2016 at 4:48 AM, Jeff Layton <jlayton@redhat.com> wrote:
>
> While this would be good to get in, I don't see any particular urgency
> here. This seems like it'd be reasonable for v4.9.

Agreed, looks ok to me. It certainly does not look like a new
regression or like a serious problem in practice. So 4.9 sounds
appropriate.

            Linus

* Re: [PATCH] nfsd: more robust allocation failure handling in nfsd_reply_cache_init
  2016-08-30 18:23 ` Linus Torvalds
@ 2016-10-20 15:52   ` Bruce Fields
  2016-10-20 16:09     ` Linus Torvalds
  0 siblings, 1 reply; 4+ messages in thread
From: Bruce Fields @ 2016-10-20 15:52 UTC
  To: Linus Torvalds
  Cc: Jeff Layton, Linux NFS Mailing List, Linux Kernel Mailing List,
	Olaf Hering

On Tue, Aug 30, 2016 at 11:23:36AM -0700, Linus Torvalds wrote:
> On Tue, Aug 30, 2016 at 4:48 AM, Jeff Layton <jlayton@redhat.com> wrote:
> >
> > While this would be good to get in, I don't see any particular urgency
> > here. This seems like it'd be reasonable for v4.9.
> 
> Agreed, looks ok to me. It certainly does not look like a new
> regression or like a serious problem in practice. So 4.9 sounds
> appropriate.

Gah, Jeff points out I forgot to merge this.

Jeff was also wondering whether we could instead just allocate this with
vmalloc--is there any drawback?  We only allocate this on nfsd startup,
so if the only drawback is the allocation itself being expensive then
that's no big deal.

--b.

* Re: [PATCH] nfsd: more robust allocation failure handling in nfsd_reply_cache_init
  2016-10-20 15:52   ` Bruce Fields
@ 2016-10-20 16:09     ` Linus Torvalds
  0 siblings, 0 replies; 4+ messages in thread
From: Linus Torvalds @ 2016-10-20 16:09 UTC
  To: Bruce Fields
  Cc: Jeff Layton, Linux NFS Mailing List, Linux Kernel Mailing List,
	Olaf Hering

On Thu, Oct 20, 2016 at 8:52 AM, Bruce Fields <bfields@fieldses.org> wrote:
>
> Jeff was also wondering whether we could instead just allocate this with
> vmalloc--is there any drawback?  We only allocate this on nfsd startup,
> so if the only drawback is the allocation itself being expensive then
> that's no big deal.

vmalloc is ok. Generally if it's *usually* a small allocation, the
best pattern tends to be to first try to kmalloc (or get_free_pages())
using __GFP_NORETRY | __GFP_NOWARN, and then fall back on vmalloc().
That way you don't end up doing vmalloc's for things that really don't
need it.

If you do that, we have a "kvfree()" helper that is "free either
kmalloc or vmalloc area", so you don't have to track after-the-fact
which one you did.

               Linus
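
The pattern described above (try the cheap contiguous allocator first,
fall back to vmalloc, and free through one helper that works out which
allocator was used) can be sketched as a small userspace model. This is
only an illustration under stated assumptions: every model_* name, the
4 KiB contiguous limit and the fixed 1 MiB arena are hypothetical
stand-ins for kmalloc(), vmalloc(), is_vmalloc_addr() and kvfree(), and
the arena never recycles space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_CONTIG_MAX 4096u		/* assumption: larger contiguous requests fail */
#define MODEL_VAREA_SIZE (1u << 20)	/* assumption: 1 MiB stand-in for vmalloc space */

/* Fixed arena standing in for the kernel's vmalloc address range. */
static union {
	max_align_t align;
	unsigned char bytes[MODEL_VAREA_SIZE];
} model_varea;
static size_t model_varea_used;

/* Live allocation counts, kept only so the behaviour can be observed. */
static int model_kmalloc_live, model_vmalloc_live;

/* Stand-in for kmalloc() with __GFP_NORETRY | __GFP_NOWARN: fails quietly. */
static void *model_kmalloc(size_t size)
{
	void *p;

	if (size > MODEL_CONTIG_MAX)
		return NULL;
	p = malloc(size);
	if (p)
		model_kmalloc_live++;
	return p;
}

/* Stand-in for vmalloc(): bump allocation from the arena, 16-byte granules. */
static void *model_vmalloc(size_t size)
{
	size_t rounded = (size + 15) & ~(size_t)15;
	void *p;

	if (rounded < size || rounded > MODEL_VAREA_SIZE - model_varea_used)
		return NULL;
	p = model_varea.bytes + model_varea_used;
	model_varea_used += rounded;
	model_vmalloc_live++;
	return p;
}

/* Stand-in for is_vmalloc_addr(): does p point into the arena? */
static bool model_is_vmalloc_addr(const void *p)
{
	uintptr_t a = (uintptr_t)p;
	uintptr_t base = (uintptr_t)model_varea.bytes;

	return a >= base && a < base + MODEL_VAREA_SIZE;
}

/* Contiguous allocation first, arena fallback second. */
static void *model_alloc_contig_first(size_t size)
{
	void *p = model_kmalloc(size);

	return p ? p : model_vmalloc(size);
}

/* Stand-in for kvfree(): the address alone says which allocator to undo. */
static void model_kvfree(void *p)
{
	if (!p)
		return;
	if (model_is_vmalloc_addr(p)) {
		assert(model_vmalloc_live > 0);
		model_vmalloc_live--;	/* arena space is not recycled here */
	} else {
		assert(model_kmalloc_live > 0);
		model_kmalloc_live--;
		free(p);
	}
}
```

In this model a 128-byte request is served by model_kmalloc() and a
64 KiB request falls through to the arena, yet both are released with
the same model_kvfree() call, so the caller never records which path
was taken.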
