mm-commits Archive on lore.kernel.org
From: Andrew Morton <akpm@linux-foundation.org>
To: akpm@linux-foundation.org, brookxu@tencent.com,
	hannes@cmpxchg.org, kirill.shutemov@linux.intel.com,
	linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org,
	stable@vger.kernel.org, torvalds@linux-foundation.org,
	vdavydov.dev@gmail.com
Subject: [patch 01/10] memcg: fix NULL pointer dereference in __mem_cgroup_usage_unregister_event
Date: Sat, 21 Mar 2020 18:22:10 -0700
Message-ID: <20200322012210.Ju8CEM46J%akpm@linux-foundation.org>
In-Reply-To: <20200321181954.c0564dfd5514cd742b534884@linux-foundation.org>


From: Chunguang Xu <brookxu@tencent.com>
Subject: memcg: fix NULL pointer dereference in __mem_cgroup_usage_unregister_event

If an eventfd monitors multiple memory thresholds of a cgroup and is then
closed, the kernel deletes all events related to this eventfd.  If another
eventfd starts monitoring a memory threshold of the same cgroup before all
of those events have been deleted, the kernel crashes:

[135.675108] BUG: kernel NULL pointer dereference, address: 0000000000000004
[135.675350] #PF: supervisor write access in kernel mode
[135.675579] #PF: error_code(0x0002) - not-present page
[135.675816] PGD 800000033058e067 P4D 800000033058e067 PUD 3355ce067 PMD 0
[135.676080] Oops: 0002 [#1] SMP PTI
[135.676332] CPU: 2 PID: 14012 Comm: kworker/2:6 Kdump: loaded Not tainted 5.6.0-rc4 #3
[135.676610] Hardware name: LENOVO 20AWS01K00/20AWS01K00, BIOS GLET70WW (2.24 ) 05/21/2014
[135.676909] Workqueue: events memcg_event_remove
[135.677192] RIP: 0010:__mem_cgroup_usage_unregister_event+0xb3/0x190
[135.677825] RSP: 0018:ffffb47e01c4fe18 EFLAGS: 00010202
[135.678186] RAX: 0000000000000001 RBX: ffff8bb223a8a000 RCX: 0000000000000001
[135.678548] RDX: 0000000000000001 RSI: ffff8bb22fb83540 RDI: 0000000000000001
[135.678912] RBP: ffffb47e01c4fe48 R08: 0000000000000000 R09: 0000000000000010
[135.679287] R10: 000000000000000c R11: 071c71c71c71c71c R12: ffff8bb226aba880
[135.679670] R13: ffff8bb223a8a480 R14: 0000000000000000 R15: 0000000000000000
[135.680066] FS:  0000000000000000(0000) GS:ffff8bb242680000(0000) knlGS:0000000000000000
[135.680475] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[135.680894] CR2: 0000000000000004 CR3: 000000032c29c003 CR4: 00000000001606e0
[135.681325] Call Trace:
[135.681763]  memcg_event_remove+0x32/0x90
[135.682209]  process_one_work+0x172/0x380
[135.682657]  worker_thread+0x49/0x3f0
[135.683111]  kthread+0xf8/0x130
[135.683570]  ? max_active_store+0x80/0x80
[135.684034]  ? kthread_bind+0x10/0x10
[135.684506]  ret_from_fork+0x35/0x40
[135.689733] CR2: 0000000000000004

The problem can be reproduced as follows:

1. Create a new cgroup subdirectory and a new eventfd, then monitor
   multiple memory thresholds of the cgroup through this eventfd.

2. Close this eventfd.  __mem_cgroup_usage_unregister_event() will then
   be called multiple times to delete all events related to this
   eventfd.
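
As a rough illustration of steps 1 and 2, the sketch below arms several
usage thresholds on one eventfd through the cgroup v1 interface (writing
"<event_fd> <fd of memory.usage_in_bytes> <threshold>" to
cgroup.event_control) and then closes the eventfd.  It is not the
reporter's reproducer: the helper names, the number of thresholds, and
the 1 MiB spacing are assumptions made for illustration, and hitting
the race would additionally require a second eventfd registering a
threshold while the removals are still in progress.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Build the "<event_fd> <usage_fd> <threshold>" line for cgroup.event_control. */
static int format_event_control(char *buf, size_t len, int efd, int ufd,
				unsigned long long threshold)
{
	int n;

	assert(buf && len > 0);
	n = snprintf(buf, len, "%d %d %llu", efd, ufd, threshold);
	return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Arm one usage threshold (in bytes) on the cgroup v1 directory cgroup_dir. */
static int register_threshold(const char *cgroup_dir, int efd,
			      unsigned long long threshold)
{
	char path[4096], line[64];
	int ufd, cfd, n, ret = -1;

	snprintf(path, sizeof(path), "%s/memory.usage_in_bytes", cgroup_dir);
	ufd = open(path, O_RDONLY);
	if (ufd < 0)
		return -1;

	snprintf(path, sizeof(path), "%s/cgroup.event_control", cgroup_dir);
	cfd = open(path, O_WRONLY);
	if (cfd >= 0) {
		n = format_event_control(line, sizeof(line), efd, ufd, threshold);
		if (n > 0 && write(cfd, line, n) == n)
			ret = 0;
		close(cfd);
	}
	/* Assumption: the fds need not stay open once the event is registered. */
	close(ufd);
	return ret;
}

/* Steps 1 and 2: several thresholds on one eventfd, then close it. */
static int arm_and_close(const char *cgroup_dir)
{
	int efd = eventfd(0, 0);
	int i, ret = 0;

	if (efd < 0)
		return -1;
	/* Eight thresholds, 1 MiB apart: arbitrary values for illustration. */
	for (i = 1; i <= 8 && !ret; i++)
		ret = register_threshold(cgroup_dir, efd, i * 1048576ULL);
	close(efd);	/* removal of the events then runs from a workqueue */
	return ret;
}
```

arm_and_close() would be called with the path of a freshly created
cgroup v1 memory subdirectory; the exact mount point depends on the
system.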

The first time __mem_cgroup_usage_unregister_event() is called, the kernel
clears all items related to this eventfd in thresholds->primary.  Since
there is currently only one eventfd, thresholds->primary becomes empty,
so the kernel sets thresholds->primary and thresholds->spare to NULL.
If the user now creates a new eventfd and monitors a memory threshold of
this cgroup, the kernel re-initializes thresholds->primary.  Then, when
__mem_cgroup_usage_unregister_event() is called for the second time,
thresholds->primary is not empty, so the kernel accesses
thresholds->spare.  But thresholds->spare is NULL, which triggers the
crash.

In general, the longer it takes to delete all events related to this
eventfd, the easier it is to trigger this problem.

The solution is to check, when deleting an event, whether the thresholds
associated with the eventfd have already been cleared.  If so, we do
nothing.
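
The sequence described above, and the effect of the added check, can be
modelled in userspace.  The sketch below is not kernel code: the struct
and function names are simplified stand-ins, locking and RCU are left
out, entries are not kept sorted, and the "guard" parameter exists only
so the same function can show the behaviour with and without the check.
Instead of dereferencing a NULL spare array, the model returns
UNREG_BAD_SPARE.

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel structures; names are illustrative. */
struct entry {
	int eventfd;
	unsigned long threshold;
};

struct ary {
	unsigned int cap;	/* model-only: allocated number of entries */
	unsigned int size;
	struct entry entries[];
};

struct thresholds {
	struct ary *primary;
	struct ary *spare;
};

enum unreg_result { UNREG_REMOVED, UNREG_NOTHING, UNREG_BAD_SPARE };

/* Grow primary by one entry; the old primary becomes the spare. */
static void register_event(struct thresholds *t, int eventfd, unsigned long thr)
{
	unsigned int i, old = t->primary ? t->primary->size : 0;
	struct ary *new = malloc(sizeof(*new) + (old + 1) * sizeof(struct entry));

	assert(new);
	new->cap = new->size = old + 1;
	for (i = 0; i < old; i++)
		new->entries[i] = t->primary->entries[i];
	new->entries[old] = (struct entry){ eventfd, thr };
	free(t->spare);
	t->spare = t->primary;
	t->primary = new;
}

/* Remove every entry owned by eventfd; guard enables the check from the patch. */
static enum unreg_result unregister_event(struct thresholds *t, int eventfd,
					  int guard)
{
	unsigned int i, j, size = 0, entries = 0;
	struct ary *new;

	if (!t->primary)
		return UNREG_NOTHING;
	for (i = 0; i < t->primary->size; i++) {
		if (t->primary->entries[i].eventfd != eventfd)
			size++;
		else
			entries++;
	}
	new = t->spare;
	if (guard && !entries)
		return UNREG_NOTHING;	/* already cleared by an earlier call */
	if (!size) {
		/* No thresholds left: drop both arrays. */
		free(new);
		free(t->primary);
		t->primary = t->spare = NULL;
		return UNREG_REMOVED;
	}
	/* Model-only safety check; the kernel would write new->size here. */
	if (!new || new->cap < size)
		return UNREG_BAD_SPARE;
	new->size = size;
	for (i = 0, j = 0; i < t->primary->size; i++) {
		if (t->primary->entries[i].eventfd != eventfd)
			new->entries[j++] = t->primary->entries[i];
	}
	t->spare = t->primary;
	t->primary = new;
	return UNREG_REMOVED;
}

/* Replay the sequence above; returns the result of the second unregister. */
static enum unreg_result replay_race(int guard)
{
	struct thresholds t = { NULL, NULL };
	enum unreg_result first, second;

	register_event(&t, 3, 100);	/* eventfd 3 watches two thresholds */
	register_event(&t, 3, 200);
	first = unregister_event(&t, 3, guard);	/* empties both arrays */
	assert(first == UNREG_REMOVED && !t.primary && !t.spare);
	register_event(&t, 7, 300);	/* new eventfd re-creates primary only */
	second = unregister_event(&t, 3, guard);	/* stale second removal */
	free(t.primary);
	free(t.spare);
	return second;
}
```

In this model replay_race(0) returns UNREG_BAD_SPARE, the point at which
the unpatched kernel would write through the NULL spare pointer, while
replay_race(1) returns UNREG_NOTHING.  In the kernel structure the size
field is believed to follow a leading int, which would be consistent
with the faulting address 0x4 in the oops.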

[akpm@linux-foundation.org: fix comment, per Kirill]
Link: http://lkml.kernel.org/r/077a6f67-aefa-4591-efec-f2f3af2b0b02@gmail.com
Fixes: 907860ed381a ("cgroups: make cftype.unregister_event() void-returning")
Signed-off-by: Chunguang Xu <brookxu@tencent.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memcontrol.c |   10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

--- a/mm/memcontrol.c~memcg-fix-null-pointer-dereference-in-__mem_cgroup_usage_unregister_event
+++ a/mm/memcontrol.c
@@ -4027,7 +4027,7 @@ static void __mem_cgroup_usage_unregiste
 	struct mem_cgroup_thresholds *thresholds;
 	struct mem_cgroup_threshold_ary *new;
 	unsigned long usage;
-	int i, j, size;
+	int i, j, size, entries;
 
 	mutex_lock(&memcg->thresholds_lock);
 
@@ -4047,14 +4047,20 @@ static void __mem_cgroup_usage_unregiste
 	__mem_cgroup_threshold(memcg, type == _MEMSWAP);
 
 	/* Calculate new number of threshold */
-	size = 0;
+	size = entries = 0;
 	for (i = 0; i < thresholds->primary->size; i++) {
 		if (thresholds->primary->entries[i].eventfd != eventfd)
 			size++;
+		else
+			entries++;
 	}
 
 	new = thresholds->spare;
 
+	/* If no items related to eventfd have been cleared, nothing to do */
+	if (!entries)
+		goto unlock;
+
 	/* Set thresholds array to NULL if we don't have thresholds */
 	if (!size) {
 		kfree(new);
_

