From: Vlastimil Babka <vbabka@suse.cz>
To: linux-mm@kvack.org
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Muchun Song <songmuchun@bytedance.com>,
	Chris Down <chris@chrisdown.name>,
	Michal Hocko <mhocko@kernel.org>,
	Matthew Wilcox <willy@infradead.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	Chunxin Zang <zangchunxin@bytedance.com>
Subject: [PATCH] mm, vmscan: guarantee drop_slab_node() termination
Date: Wed, 18 Aug 2021 17:22:39 +0200
Message-ID: <20210818152239.25502-1-vbabka@suse.cz>

drop_slab_node() is called as part of the echo 2 > /proc/sys/vm/drop_caches
operation. It iterates over all memcgs and calls shrink_slab() which in turn
iterates over all slab shrinkers. Freed objects are counted and as long as the
total number of freed objects from all memcgs and shrinkers is higher than 10,
drop_slab_node() loops for another full memcgs*shrinkers iteration.

This arbitrary constant threshold of 10 can result in effectively an infinite
loop on a system with a large number of memcgs and/or parallel activity that
allocates new objects. This has been reported previously by Chunxin Zang [1]
and recently by our customer.

The previous report [1] has resulted in commit 069c411de40a ("mm/vmscan: fix
infinite loop in drop_slab_node") which added a check for signals allowing the
user to terminate the command writing to drop_caches. At the time it was also
considered to make the threshold grow with each iteration to guarantee
termination, but such a patch hasn't been formally proposed yet.

This patch implements the dynamically growing threshold. In the first iteration
it's enough to free one object to continue, and this threshold effectively
doubles with each iteration. Our customer's feedback was positive.
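
To illustrate the doubling, here is a rough user-space model of the new loop
condition. It is only a sketch and not kernel code: the helper names
min_freed_to_continue() and passes_with_constant_freed() are made up for this
example, and it assumes every full pass reports the same freed count, which a
real system will not do.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical helper: the smallest freed count that keeps the loop going
 * for a given shift, since (freed >> shift) > 0 is equivalent to
 * freed >= (1UL << shift).
 */
static unsigned long min_freed_to_continue(int shift)
{
	assert(shift >= 0 && shift < (int)(sizeof(unsigned long) * CHAR_BIT));
	return 1UL << shift;
}

/*
 * Hypothetical model: count the passes the patched loop would make if
 * every full memcgs*shrinkers pass freed exactly `freed` objects.
 */
static int passes_with_constant_freed(unsigned long freed)
{
	int shift = 0;
	int passes = 0;

	/* Keeps shift below the width of unsigned long in this model. */
	assert(freed <= (ULONG_MAX >> 1));

	do {
		passes++;	/* stands in for one full shrink_slab() sweep */
	} while ((freed >> shift++) > 0);

	return passes;
}
```

In this model a steady 11 freed objects per pass stops after 5 passes, whereas
the old "freed > 10" condition would never stop on the same input.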

There is always a risk that on some systems this change will cause a
previously terminating drop_caches operation to terminate sooner and free fewer
objects. Ideally the semantics would guarantee freeing all freeable objects
that existed at the moment of starting the operation, while not looping forever
for newly allocated objects, but that's not feasible to track. In the less
ideal solution based on thresholds, arguably the termination guarantee is more
important than the exhaustiveness guarantee. If there are reports of a large
regression in exhaustiveness, we can tune how fast the threshold grows.

[1] https://lore.kernel.org/lkml/20200909152047.27905-1-zangchunxin@bytedance.com/T/#u

Reported-by: Chunxin Zang <zangchunxin@bytedance.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/vmscan.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 403a175a720f..ef3554314b47 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -936,6 +936,7 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
 void drop_slab_node(int nid)
 {
 	unsigned long freed;
+	int shift = 0;
 
 	do {
 		struct mem_cgroup *memcg = NULL;
@@ -948,7 +949,7 @@ void drop_slab_node(int nid)
 		do {
 			freed += shrink_slab(GFP_KERNEL, nid, memcg, 0);
 		} while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)) != NULL);
-	} while (freed > 10);
+	} while ((freed >> shift++) > 0);
 }
 
 void drop_slab(void)
-- 
2.32.0



