From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.8 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9930DC388F2 for ; Mon, 2 Nov 2020 01:07:36 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 693F0206D4 for ; Mon, 2 Nov 2020 01:07:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1604279256; bh=Rvybe/Rx/O6ZTeDLUGOCMcvuzDgq2U09DPFIWAqvfeY=; h=Date:From:To:Subject:In-Reply-To:Reply-To:List-ID:From; b=p5tWT9cOv7Im5KRF0poHoOtCgcW2cAzLSsMWP6dnVJWMWy5cf8hDhvK7RS7RcznfM x7KKoWo7Wt35TAXcwf5iuzIuE+BzVxUZzmFKjK9aM5wk/oM7qVjFTTQN2bEYTd0g8T OjyVqJMkUfjJmIWfDV041uCLYeJOXOrBFEsPfiTg= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727461AbgKBBHg (ORCPT ); Sun, 1 Nov 2020 20:07:36 -0500 Received: from mail.kernel.org ([198.145.29.99]:48918 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727335AbgKBBHf (ORCPT ); Sun, 1 Nov 2020 20:07:35 -0500 Received: from localhost.localdomain (c-73-231-172-41.hsd1.ca.comcast.net [73.231.172.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id D5DC7206E5; Mon, 2 Nov 2020 01:07:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1604279255; bh=Rvybe/Rx/O6ZTeDLUGOCMcvuzDgq2U09DPFIWAqvfeY=; h=Date:From:To:Subject:In-Reply-To:From; b=oaBEqRsNfKDdN7QduktQLBTTLCD90o4wHXMiRWfPv+HATVlOFT0LAk2VQhroIAyNh bI85dooajjc/ruJjXnEWQKvcqRZ4TcqcWRWiyvCAihPgmd1GURtHqkYBxbk/mNO0so KzFxgI6qXmddQz0qgjOgHsZNp7J2oo/sU2V7ZK+8= Date: Sun, 01 Nov 2020 17:07:34 -0800 From: Andrew Morton To: akpm@linux-foundation.org, guro@fb.com, hannes@cmpxchg.org, linux-mm@kvack.org, ltp@lists.linux.it, mhocko@kernel.org, mkoutny@suse.com, mm-commits@vger.kernel.org, rpalethorpe@suse.com, shakeelb@google.com, stable@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 04/15] mm: memcg: link page counters to root if use_hierarchy is false Message-ID: <20201102010734.L07vYi-Dp%akpm@linux-foundation.org> In-Reply-To: <20201101170656.48abbd5e88375219f868af5e@linux-foundation.org> User-Agent: s-nail v14.8.16 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Precedence: bulk Reply-To: linux-kernel@vger.kernel.org List-ID: X-Mailing-List: mm-commits@vger.kernel.org =46rom: Roman Gushchin Subject: mm: memcg: link page counters to root if use_hierarchy is false Richard reported a warning which can be reproduced by running the LTP madvise6 test (cgroup v1 in the non-hierarchical mode should be used): [ 9.841552] ------------[ cut here ]------------ [ 9.841788] WARNING: CPU: 0 PID: 12 at mm/page_counter.c:57 page_counter= _uncharge (mm/page_counter.c:57 mm/page_counter.c:50 mm/page_counter.c:156) [ 9.841982] Modules linked in: [ 9.842072] CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.9.0-rc7-22-de= fault #77 [ 9.842266] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS = rel-1.13.0-48-gd9c812d-rebuilt.opensuse.org 04/01/2014 [ 9.842571] Workqueue: events drain_local_stock [ 9.842750] RIP: 0010:page_counter_uncharge (mm/page_counter.c:57 mm/pag= e_counter.c:50 mm/page_counter.c:156) [ 9.842894] Code: 0f c1 45 00 4c 29 e0 48 89 ef 48 89 c3 48 89 c6 e8 2a fe = ff ff 48 85 db 78 10 48 8b 6d 28 48 85 ed 75 d8 5b 5d 41 5c 41 5d c3 <0f> 0= b eb ec 90 e8 4b f9 88 2a 48 8b 17 48 39 d6 72 41 41 54 49 89 [ 9.843438] RSP: 0018:ffffb1c18006be28 EFLAGS: 00010086 [ 9.843585] RAX: ffffffffffffffff RBX: ffffffffffffffff RCX: ffff94803bc= 2cae0 [ 9.843806] RDX: 0000000000000001 RSI: ffffffffffffffff RDI: ffff948007d= 2b248 [ 9.844026] RBP: ffff948007d2b248 R08: ffff948007c58eb0 R09: ffff948007d= a05ac [ 9.844248] R10: 0000000000000018 R11: 0000000000000018 R12: 00000000000= 00001 [ 9.844477] R13: ffffffffffffffff R14: 0000000000000000 R15: ffff94803bc= 2cac0 [ 9.844696] FS: 0000000000000000(0000) GS:ffff94803bc00000(0000) knlGS:= 0000000000000000 [ 9.844915] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 9.845096] CR2: 00007f0579ee0384 CR3: 000000002cc0a000 CR4: 00000000000= 006f0 [ 9.845319] Call Trace: [ 9.845429] __memcg_kmem_uncharge (mm/memcontrol.c:3022) [ 9.845582] drain_obj_stock (./include/linux/rcupdate.h:689 mm/memcontro= l.c:3114) [ 9.845684] drain_local_stock (mm/memcontrol.c:2255) [ 9.845789] process_one_work (./arch/x86/include/asm/jump_label.h:25 ./i= nclude/linux/jump_label.h:200 ./include/trace/events/workqueue.h:108 kernel= /workqueue.c:2274) [ 9.845898] worker_thread (./include/linux/list.h:282 kernel/workqueue.c= :2416) [ 9.846034] ? process_one_work (kernel/workqueue.c:2358) [ 9.846162] kthread (kernel/kthread.c:292) [ 9.846271] ? __kthread_bind_mask (kernel/kthread.c:245) [ 9.846420] ret_from_fork (arch/x86/entry/entry_64.S:300) [ 9.846531] ---[ end trace 8b5647c1eba9d18a ]--- The problem occurs because in the non-hierarchical mode non-root page counters are not linked to root page counters, so the charge is not propagated to the root memory cgroup. After the removal of the original memory cgroup and reparenting of the object cgroup, the root cgroup might be uncharged by draining a objcg stock, for example. It leads to an eventual underflow of the charge and triggers a warning. Fix it by linking all page counters to corresponding root page counters in the non-hierarchical mode. Please note, that in the non-hierarchical mode all objcgs are always reparented to the root memory cgroup, even if the hierarchy has more than 1 level. This patch doesn't change it. The patch also doesn't affect how the hierarchical mode is working, which is the only sane and truly supported mode now. Thanks to Richard for reporting, debugging and providing an alternative version of the fix! Link: https://lkml.kernel.org/r/20201026231326.3212225-1-guro@fb.com Fixes: bf4f059954dc ("mm: memcg/slab: obj_cgroup API") Signed-off-by: Roman Gushchin Debugged-by: Richard Palethorpe Reported-by: Reviewed-by: Shakeel Butt Acked-by: Johannes Weiner Reviewed-by: Michal Koutn=C3=BD Cc: Michal Hocko Cc: Signed-off-by: Andrew Morton --- mm/memcontrol.c | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) --- a/mm/memcontrol.c~mm-memcg-link-page-counters-to-root-if-use_hierarchy-= is-false +++ a/mm/memcontrol.c @@ -5345,17 +5345,22 @@ mem_cgroup_css_alloc(struct cgroup_subsy memcg->swappiness =3D mem_cgroup_swappiness(parent); memcg->oom_kill_disable =3D parent->oom_kill_disable; } - if (parent && parent->use_hierarchy) { + if (!parent) { + page_counter_init(&memcg->memory, NULL); + page_counter_init(&memcg->swap, NULL); + page_counter_init(&memcg->kmem, NULL); + page_counter_init(&memcg->tcpmem, NULL); + } else if (parent->use_hierarchy) { memcg->use_hierarchy =3D true; page_counter_init(&memcg->memory, &parent->memory); page_counter_init(&memcg->swap, &parent->swap); page_counter_init(&memcg->kmem, &parent->kmem); page_counter_init(&memcg->tcpmem, &parent->tcpmem); } else { - page_counter_init(&memcg->memory, NULL); - page_counter_init(&memcg->swap, NULL); - page_counter_init(&memcg->kmem, NULL); - page_counter_init(&memcg->tcpmem, NULL); + page_counter_init(&memcg->memory, &root_mem_cgroup->memory); + page_counter_init(&memcg->swap, &root_mem_cgroup->swap); + page_counter_init(&memcg->kmem, &root_mem_cgroup->kmem); + page_counter_init(&memcg->tcpmem, &root_mem_cgroup->tcpmem); /* * Deeper hierachy with use_hierarchy =3D=3D false doesn't make * much sense so let cgroup subsystem know about this _