From: Miles Chen <miles.chen@mediatek.com>
To: <gregkh@linuxfoundation.org>
Cc: <akpm@linux-foundation.org>, <cai@lca.pw>, <hannes@cmpxchg.org>,
	<mhocko@suse.com>, <stable@vger.kernel.org>,
	<torvalds@linux-foundation.org>, <vdavydov.dev@gmail.com>
Subject: Re: FAILED: patch "[PATCH] mm/memcontrol.c: fix use after free in mem_cgroup_iter()" failed to apply to 4.4-stable tree
Date: Fri, 16 Aug 2019 19:39:54 +0800	[thread overview]
Message-ID: <1565955594.26404.6.camel@mtkswgap22> (raw)
In-Reply-To: <1565949870255118@kroah.com>

On Fri, 2019-08-16 at 12:04 +0200, gregkh@linuxfoundation.org wrote:
> The patch below does not apply to the 4.4-stable tree.
> If someone wants it applied there, or to any other stable or longterm
> tree, then please email the backport, including the original git commit
> id to <stable@vger.kernel.org>.
> 
> thanks,
> 
> greg k-h
> 
Hi Greg,

Below is the backport for 4.4

cheers,
Miles

From 492948a33742705cd4d53f229d2bb512ace5301b Mon Sep 17 00:00:00 2001
From: Miles Chen <miles.chen@mediatek.com>
Date: Fri, 16 Aug 2019 19:32:03 +0800
Subject: [PATCH] BACKPORT: mm/memcontrol.c: fix use after free in
 mem_cgroup_iter()

original commit id: 54a83d6bcbf8f4700013766b974bf9190d40b689

This patch is sent to report a use-after-free in mem_cgroup_iter() that
still occurs after merging commit be2657752e9e ("mm: memcg: fix use after
free in mem_cgroup_iter()").

I work with the Android kernel trees (4.9 & 4.14), and commit be2657752e9e
("mm: memcg: fix use after free in mem_cgroup_iter()") has been merged into
both trees.  However, I can still observe the use-after-free issue addressed
by that commit (on low-end devices, a few times this month).

backtrace:
	css_tryget <- crash here
	mem_cgroup_iter
	shrink_node
	shrink_zones
	do_try_to_free_pages
	try_to_free_pages
	__perform_reclaim
	__alloc_pages_direct_reclaim
	__alloc_pages_slowpath
	__alloc_pages_nodemask

To debug, I poisoned mem_cgroup before freeing it:

static void __mem_cgroup_free(struct mem_cgroup *memcg)
{
	for_each_node(node)
		free_mem_cgroup_per_node_info(memcg, node);
	free_percpu(memcg->stat);
+       /* poison memcg before freeing it */
+       memset(memcg, 0x78, sizeof(struct mem_cgroup));
	kfree(memcg);
}

The coredump shows that position = 0xdbbc2a00 points to freed (poisoned) memory:

(gdb) p/x ((struct mem_cgroup_per_node *)0xe5009e00)->iter[8]
$13 = {position = 0xdbbc2a00, generation = 0x2efd}

0xdbbc2a00:     0xdbbc2e00      0x00000000      0xdbbc2800      0x00000100
0xdbbc2a10:     0x00000200      0x78787878      0x00026218      0x00000000
0xdbbc2a20:     0xdcad6000      0x00000001      0x78787800      0x00000000
0xdbbc2a30:     0x78780000      0x00000000      0x0068fb84      0x78787878
0xdbbc2a40:     0x78787878      0x78787878      0x78787878      0xe3fa5cc0
0xdbbc2a50:     0x78787878      0x78787878      0x00000000      0x00000000
0xdbbc2a60:     0x00000000      0x00000000      0x00000000      0x00000000
0xdbbc2a70:     0x00000000      0x00000000      0x00000000      0x00000000
0xdbbc2a80:     0x00000000      0x00000000      0x00000000      0x00000000
0xdbbc2a90:     0x00000001      0x00000000      0x00000000      0x00100000
0xdbbc2aa0:     0x00000001      0xdbbc2ac8      0x00000000      0x00000000
0xdbbc2ab0:     0x00000000      0x00000000      0x00000000      0x00000000
0xdbbc2ac0:     0x00000000      0x00000000      0xe5b02618      0x00001000
0xdbbc2ad0:     0x00000000      0x78787878      0x78787878      0x78787878
0xdbbc2ae0:     0x78787878      0x78787878      0x78787878      0x78787878
0xdbbc2af0:     0x78787878      0x78787878      0x78787878      0x78787878
0xdbbc2b00:     0x78787878      0x78787878      0x78787878      0x78787878
0xdbbc2b10:     0x78787878      0x78787878      0x78787878      0x78787878
0xdbbc2b20:     0x78787878      0x78787878      0x78787878      0x78787878
0xdbbc2b30:     0x78787878      0x78787878      0x78787878      0x78787878
0xdbbc2b40:     0x78787878      0x78787878      0x78787878      0x78787878
0xdbbc2b50:     0x78787878      0x78787878      0x78787878      0x78787878
0xdbbc2b60:     0x78787878      0x78787878      0x78787878      0x78787878
0xdbbc2b70:     0x78787878      0x78787878      0x78787878      0x78787878
0xdbbc2b80:     0x78787878      0x78787878      0x00000000      0x78787878
0xdbbc2b90:     0x78787878      0x78787878      0x78787878      0x78787878
0xdbbc2ba0:     0x78787878      0x78787878      0x78787878      0x78787878

In the reclaim path, try_to_free_pages() does not set up
sc.target_mem_cgroup, and sc is passed down through do_try_to_free_pages(),
..., to shrink_node().

In mem_cgroup_iter(), root is set to root_mem_cgroup because
sc->target_mem_cgroup is NULL.  It is possible to assign a memcg to
root_mem_cgroup.nodeinfo.iter in mem_cgroup_iter().

	try_to_free_pages
		struct scan_control sc = {...}, target_mem_cgroup is 0x0;
	do_try_to_free_pages
	shrink_zones
	shrink_node
		 mem_cgroup *root = sc->target_mem_cgroup;
		 memcg = mem_cgroup_iter(root, NULL, &reclaim);
	mem_cgroup_iter()
		if (!root)
			root = root_mem_cgroup;
		...

		css = css_next_descendant_pre(css, &root->css);
		memcg = mem_cgroup_from_css(css);
		cmpxchg(&iter->position, pos, memcg);

My device uses memcg in non-hierarchical mode.  When a memcg is released,
invalidate_reclaim_iterators() reaches only dead_memcg and its parents; in
non-hierarchical mode that walk never reaches root_mem_cgroup.

static void invalidate_reclaim_iterators(struct mem_cgroup *dead_memcg)
{
	struct mem_cgroup *memcg = dead_memcg;

	for (; memcg; memcg = parent_mem_cgroup(memcg))
	...
}

So the use after free scenario looks like:

CPU1						CPU2

try_to_free_pages
do_try_to_free_pages
shrink_zones
shrink_node
mem_cgroup_iter()
    if (!root)
    	root = root_mem_cgroup;
    ...
    css = css_next_descendant_pre(css, &root->css);
    memcg = mem_cgroup_from_css(css);
    cmpxchg(&iter->position, pos, memcg);

					invalidate_reclaim_iterators(memcg);
					...
					__mem_cgroup_free()
						kfree(memcg);

try_to_free_pages
do_try_to_free_pages
shrink_zones
shrink_node
mem_cgroup_iter()
    if (!root)
    	root = root_mem_cgroup;
    ...
    mz = mem_cgroup_nodeinfo(root, reclaim->pgdat->node_id);
    iter = &mz->iter[reclaim->priority];
    pos = READ_ONCE(iter->position);
    css_tryget(&pos->css) <- use after free

To avoid this, we should also invalidate root_mem_cgroup.nodeinfo.iter in
invalidate_reclaim_iterators().

[cai@lca.pw: fix -Wparentheses compilation warning]
  Link: http://lkml.kernel.org/r/1564580753-17531-1-git-send-email-cai@lca.pw
Link: http://lkml.kernel.org/r/20190730015729.4406-1-miles.chen@mediatek.com
Fixes: 5ac8fb31ad2e ("mm: memcontrol: convert reclaim iterator to simple css refcounting")
Signed-off-by: Miles Chen <miles.chen@mediatek.com>
Signed-off-by: Qian Cai <cai@lca.pw>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
---
 mm/memcontrol.c | 41 ++++++++++++++++++++++++++++++-----------
 1 file changed, 30 insertions(+), 11 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index fc10620967c7..c23adc7233af 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1001,28 +1001,47 @@ void mem_cgroup_iter_break(struct mem_cgroup *root,
 		css_put(&prev->css);
 }
 
-static void invalidate_reclaim_iterators(struct mem_cgroup *dead_memcg)
+static void __invalidate_reclaim_iterators(struct mem_cgroup *from,
+					struct mem_cgroup *dead_memcg)
 {
-	struct mem_cgroup *memcg = dead_memcg;
 	struct mem_cgroup_reclaim_iter *iter;
 	struct mem_cgroup_per_zone *mz;
 	int nid, zid;
 	int i;
 
-	while ((memcg = parent_mem_cgroup(memcg))) {
-		for_each_node(nid) {
-			for (zid = 0; zid < MAX_NR_ZONES; zid++) {
-				mz = &memcg->nodeinfo[nid]->zoneinfo[zid];
-				for (i = 0; i <= DEF_PRIORITY; i++) {
-					iter = &mz->iter[i];
-					cmpxchg(&iter->position,
-						dead_memcg, NULL);
-				}
+	for_each_node(nid) {
+		for (zid = 0; zid < MAX_NR_ZONES; zid++) {
+			mz = &from->nodeinfo[nid]->zoneinfo[zid];
+			for (i = 0; i <= DEF_PRIORITY; i++) {
+				iter = &mz->iter[i];
+				cmpxchg(&iter->position,
+					dead_memcg, NULL);
 			}
 		}
 	}
 }
 
+static void invalidate_reclaim_iterators(struct mem_cgroup *dead_memcg)
+{
+	struct mem_cgroup *memcg = dead_memcg;
+	struct mem_cgroup *last;
+
+	do {
+		__invalidate_reclaim_iterators(memcg, dead_memcg);
+		last = memcg;
+	} while ((memcg = parent_mem_cgroup(memcg)));
+
+	/*
+	 * When cgroup1 non-hierarchy mode is used,
+	 * parent_mem_cgroup() does not walk all the way up to the
+	 * cgroup root (root_mem_cgroup). So we have to handle
+	 * dead_memcg from cgroup root separately.
+	 */
+	if (last != root_mem_cgroup)
+		__invalidate_reclaim_iterators(root_mem_cgroup,
+						dead_memcg);
+}
+
 /*
  * Iteration constructs for visiting all cgroups (under a tree).  If
  * loops are exited prematurely (break), mem_cgroup_iter_break() must
-- 
2.18.0




