From: Jiri Slaby <jslaby@suse.cz>
To: akpm@linux-foundation.org
Cc: linux-kernel@vger.kernel.org, Jiri Slaby <jslaby@suse.cz>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@suse.com>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	Shakeel Butt <shakeelb@google.com>,
	cgroups@vger.kernel.org, stable@vger.kernel.org,
	linux-mm@kvack.org,
	Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Subject: [PATCH -resend v2] memcg: make it work on sparse non-0-node systems
Date: Wed, 22 May 2019 11:19:40 +0200
Message-ID: <20190522091940.3615-1-jslaby@suse.cz>
In-Reply-To: <20190517114204.6330-1-jslaby@suse.cz>

We have a single-node system with node 0 disabled:
  Scanning NUMA topology in Northbridge 24
  Number of physical nodes 2
  Skipping disabled node 0
  Node 1 MemBase 0000000000000000 Limit 00000000fbff0000
  NODE_DATA(1) allocated [mem 0xfbfda000-0xfbfeffff]

This causes crashes in memcg when the system boots:
  BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
  #PF error: [normal kernel read fault]
...
  RIP: 0010:list_lru_add+0x94/0x170
...
  Call Trace:
   d_lru_add+0x44/0x50
   dput.part.34+0xfc/0x110
   __fput+0x108/0x230
   task_work_run+0x9f/0xc0
   exit_to_usermode_loop+0xf5/0x100

It is reproducible as far back as 4.12. I did not try older kernels. A
new enough systemd is required, e.g. 241 (the reason is unknown, it was
not investigated). It cannot be reproduced with systemd 234.

The system crashes because the size of the lru array is never updated in
memcg_update_all_list_lrus(), so reads go past the end of the zero-sized
array and dereference random memory.

The root cause is the list_lru_memcg_aware() checks in the list_lru code.
The test in list_lru_memcg_aware() is broken: it assumes node 0 is always
present, which is not true on some systems, as can be seen above.

So fix this by avoiding checks on node 0. Remember the memcg-awareness
in a bool flag in struct list_lru instead.

[v2] use the idea proposed by Vladimir -- the bool flag.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Fixes: 60d3fd32a7a9 ("list_lru: introduce per-memcg lists")
Cc: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Suggested-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: <cgroups@vger.kernel.org>
Cc: <stable@vger.kernel.org>
Cc: <linux-mm@kvack.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
---

This is only a resend of the patch. I did not send it akpm's way previously.
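
For illustration only (not part of the patch): a minimal userspace sketch
of why the node-0 test gives the wrong answer when node 0 is absent, and
why a flag recorded at init time does not. The names list_lru_model,
lru_node_model, model_init, aware_by_node0, aware_by_flag and model_demo
are hypothetical stand-ins, and the init logic is an assumption that only
online nodes get their per-memcg data; the real kernel structures and
init paths differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 2

/* Simplified stand-ins for the kernel structures; not the real layout. */
struct lru_node_model {
	void *memcg_lrus;	/* non-NULL only on nodes that were set up */
};

struct list_lru_model {
	struct lru_node_model node[MAX_NODES];
	bool memcg_aware;	/* models the flag added by the patch */
};

/* Old-style test: infers awareness from node 0, which may be absent. */
static bool aware_by_node0(const struct list_lru_model *lru)
{
	return lru->node[0].memcg_lrus != NULL;
}

/* New-style test: reads the flag recorded at init time. */
static bool aware_by_flag(const struct list_lru_model *lru)
{
	return lru->memcg_aware;
}

/* Assumed init: only online nodes get per-memcg data attached. */
static void model_init(struct list_lru_model *lru, bool memcg_aware,
		       const bool *online, void *data)
{
	lru->memcg_aware = memcg_aware;
	for (int i = 0; i < MAX_NODES; i++)
		lru->node[i].memcg_lrus =
			(memcg_aware && online[i]) ? data : NULL;
}

/* Node 0 disabled, node 1 present, as in the boot log above. */
void model_demo(void)
{
	static int data;
	const bool online[MAX_NODES] = { false, true };
	struct list_lru_model lru;

	model_init(&lru, true, online, &data);
	assert(!aware_by_node0(&lru));	/* node-0 test says "not aware" */
	assert(aware_by_flag(&lru));	/* flag still says "aware" */
}
```

In this model the two tests disagree exactly when node 0 is not populated,
which is the configuration that crashes here.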

 include/linux/list_lru.h | 1 +
 mm/list_lru.c            | 8 +++-----
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/include/linux/list_lru.h b/include/linux/list_lru.h
index aa5efd9351eb..d5ceb2839a2d 100644
--- a/include/linux/list_lru.h
+++ b/include/linux/list_lru.h
@@ -54,6 +54,7 @@ struct list_lru {
 #ifdef CONFIG_MEMCG_KMEM
 	struct list_head	list;
 	int			shrinker_id;
+	bool			memcg_aware;
 #endif
 };
 
diff --git a/mm/list_lru.c b/mm/list_lru.c
index 0730bf8ff39f..d3b538146efd 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -37,11 +37,7 @@ static int lru_shrinker_id(struct list_lru *lru)
 
 static inline bool list_lru_memcg_aware(struct list_lru *lru)
 {
-	/*
-	 * This needs node 0 to be always present, even
-	 * in the systems supporting sparse numa ids.
-	 */
-	return !!lru->node[0].memcg_lrus;
+	return lru->memcg_aware;
 }
 
 static inline struct list_lru_one *
@@ -451,6 +447,8 @@ static int memcg_init_list_lru(struct list_lru *lru, bool memcg_aware)
 {
 	int i;
 
+	lru->memcg_aware = memcg_aware;
+
 	if (!memcg_aware)
 		return 0;
 
-- 
2.21.0


