* [PATCH] mm/memcg: fix refcount error while moving and swapping
@ 2020-07-07 21:38 ` Hugh Dickins
  0 siblings, 0 replies; 2+ messages in thread
From: Hugh Dickins @ 2020-07-07 21:38 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Alex Shi, Shakeel Butt, Hugh Dickins,
	linux-kernel, linux-mm

It was hard to keep a test running, moving tasks between memcgs with
move_charge_at_immigrate, while swapping: mem_cgroup_id_get_many()'s
refcount is discovered to be 0 (supposedly impossible), so it is then
forced to REFCOUNT_SATURATED, and after thousands of warnings in quick
succession, the test is at last put out of misery by being OOM killed.

This is because of the way moved_swap accounting was saved up until the
task move gets completed in __mem_cgroup_clear_mc(), deferred from when
mem_cgroup_move_swap_account() actually exchanged old and new ids.
Concurrent activity can free up swap quicker than the task is scanned,
bringing id refcount down to 0 (which should only be possible when offlining).

Just skip that optimization: do that part of the accounting immediately.

Fixes: 615d66c37c75 ("mm: memcontrol: fix memcg id ref counter on swap charge move")
Cc: <stable@vger.kernel.org>
Signed-off-by: Hugh Dickins <hughd@google.com>
---
This was frustrating while testing Alex Shi's patches a few weeks
ago, and no fault of those.  I may have misattributed the "Fixes",
which was itself fixing an earlier commit, both of which were backported to v3.19;
or maybe it goes back way further than those, I didn't pursue it - not
top of the list of user complaints!  Certainly goes back before the
refcount_add() in v4.20, which replaced a VM_BUG_ON(atomic_read <= 0).

 mm/memcontrol.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- 5.8-rc4/mm/memcontrol.c	2020-06-28 15:52:13.360672658 -0700
+++ linux/mm/memcontrol.c	2020-07-05 18:11:51.136542439 -0700
@@ -5669,7 +5669,6 @@ static void __mem_cgroup_clear_mc(void)
 		if (!mem_cgroup_is_root(mc.to))
 			page_counter_uncharge(&mc.to->memory, mc.moved_swap);
 
-		mem_cgroup_id_get_many(mc.to, mc.moved_swap);
 		css_put_many(&mc.to->css, mc.moved_swap);
 
 		mc.moved_swap = 0;
@@ -5860,7 +5859,8 @@ put:			/* get_mctgt_type() gets the page
 			ent = target.ent;
 			if (!mem_cgroup_move_swap_account(ent, mc.from, mc.to)) {
 				mc.precharge--;
-				/* we fixup refcnts and charges later. */
+				mem_cgroup_id_get_many(mc.to, 1);
+				/* we fixup other refcnts and charges later. */
 				mc.moved_swap++;
 			}
 			break;
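
The ordering problem described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: struct toy_memcg, toy_id_get_many(), toy_id_put() and toy_simulate() are hypothetical names, the counter is a plain int rather than refcount_t, and the "concurrent" swap free is modelled as a deterministic step right after each entry is moved. The memcg starts with one id reference, standing in for the reference held while it is online.

```c
#include <stdbool.h>

/* Toy stand-in for a memcg id refcount; NOT the kernel's refcount_t. */
struct toy_memcg {
	int id_ref;	/* models the id reference count */
	bool hit_zero;	/* set if the count reached 0 (or below) while in use */
};

/* Hypothetical analogue of taking n id references. */
static void toy_id_get_many(struct toy_memcg *memcg, int n)
{
	memcg->id_ref += n;
}

/* Hypothetical analogue of dropping one id reference when swap is freed. */
static void toy_id_put(struct toy_memcg *memcg)
{
	memcg->id_ref--;
	if (memcg->id_ref <= 0)
		memcg->hit_zero = true;
}

/*
 * Move nr swap entries to "to". After each entry's id is exchanged, model a
 * concurrent swap free of that same entry, which drops one id reference.
 * With immediate == false the references are only taken in bulk at the end
 * (the old, deferred scheme); with immediate == true one reference is taken
 * per moved entry (the scheme this patch switches to).
 * Returns true if the refcount reached 0 while the memcg was still online.
 */
bool toy_simulate(bool immediate, int nr)
{
	struct toy_memcg to = { .id_ref = 1, .hit_zero = false };
	int moved_swap = 0;
	int i;

	for (i = 0; i < nr; i++) {
		/* the swap entry's recorded id now points at "to" */
		if (immediate)
			toy_id_get_many(&to, 1);
		moved_swap++;

		/* concurrent activity frees the entry that was just moved */
		toy_id_put(&to);
	}

	if (!immediate)
		toy_id_get_many(&to, moved_swap);	/* too late */

	return to.hit_zero;
}
```

In this model, toy_simulate(false, 3) reports that the count reached 0 before the bulk get, while toy_simulate(true, 3) never does; both end with the same final count, which is why the deferred scheme only misbehaves when swap is freed during the scan.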
