Date: Wed, 01 Apr 2020 21:07:39 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, dschatzberg@fb.com, guro@fb.com,
 hannes@cmpxchg.org, linux-mm@kvack.org, mhocko@suse.com,
 mm-commits@vger.kernel.org, torvalds@linux-foundation.org
Subject: [patch 080/155] mm: memcg: make memory.oom.group tolerable to task migration
Message-ID: <20200402040739.PB5HXuW7o%akpm@linux-foundation.org>
In-Reply-To: <20200401210155.09e3b9742e1c6e732f5a7250@linux-foundation.org>

From: Roman Gushchin
Subject: mm: memcg: make memory.oom.group tolerable to task migration

If a task is getting moved out of the OOMing cgroup, it might result in
unexpected OOM killings if memory.oom.group is used anywhere in the cgroup
tree.

Imagine the following example:

          A (oom.group = 1)
         / \
  (OOM) B   C

Let's say B's memory.max is exceeded and it's OOMing.  The OOM killer
selects a task in B as a victim, but someone asynchronously moves the task
into C.  mem_cgroup_get_oom_group() will iterate over all ancestors of C
up to the root cgroup.
In theory, it should stop at the oom_domain level - the memory
cgroup which is OOMing.  But because B is not an ancestor of C, that never
happens.  Instead it chooses A (because its oom.group is set), and kills
all tasks in A.  This behavior is wrong because the OOM happened in B, so
there is no reason to kill anything outside it.

Fix this by checking if the memory cgroup to which the task belongs is a
descendant of the oom_domain.  If not, memory.oom.group should be ignored,
and the OOM killer should kill only the victim task.

Link: http://lkml.kernel.org/r/20200316223510.3176148-1-guro@fb.com
Signed-off-by: Roman Gushchin
Reported-by: Dan Schatzberg
Acked-by: Michal Hocko
Acked-by: Johannes Weiner
Signed-off-by: Andrew Morton
---

 mm/memcontrol.c |    8 ++++++++
 1 file changed, 8 insertions(+)

--- a/mm/memcontrol.c~mm-memcg-make-memoryoomgroup-tolerable-to-task-migration
+++ a/mm/memcontrol.c
@@ -1931,6 +1931,14 @@ struct mem_cgroup *mem_cgroup_get_oom_gr
 		goto out;
 
 	/*
+	 * If the victim task has been asynchronously moved to a different
+	 * memory cgroup, we might end up killing tasks outside oom_domain.
+	 * In this case it's better to ignore memory.group.oom.
+	 */
+	if (unlikely(!mem_cgroup_is_descendant(memcg, oom_domain)))
+		goto out;
+
+	/*
 	 * Traverse the memory cgroup hierarchy from the victim task's
 	 * cgroup up to the OOMing cgroup (or root) to find the
 	 * highest-level memory cgroup with oom.group set.
_
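
For illustration only, the effect of the new check can be sketched in
user space.  This is a simplified model, not the kernel code: struct cg,
cg_is_descendant(), get_oom_group() and oom_group_example() are
hypothetical stand-ins for struct mem_cgroup, mem_cgroup_is_descendant()
and mem_cgroup_get_oom_group(), and locking, reference counting and the
special handling of the root cgroup are all left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct mem_cgroup. */
struct cg {
	const char *name;
	struct cg *parent;	/* NULL for the root */
	bool oom_group;		/* models memory.oom.group */
};

/* True if cg is ancestor itself or sits somewhere below it. */
static bool cg_is_descendant(const struct cg *cg, const struct cg *ancestor)
{
	for (; cg; cg = cg->parent)
		if (cg == ancestor)
			return true;
	return false;
}

/*
 * Return the highest-level cgroup with oom_group set between the
 * victim's cgroup and oom_domain, or NULL if only the victim task
 * should be killed.
 */
static const struct cg *get_oom_group(const struct cg *victim_cg,
				      const struct cg *oom_domain)
{
	const struct cg *group = NULL;

	/* Models the added check: the victim left the OOMing subtree. */
	if (!cg_is_descendant(victim_cg, oom_domain))
		return NULL;

	for (const struct cg *cg = victim_cg; cg; cg = cg->parent) {
		if (cg->oom_group)
			group = cg;
		if (cg == oom_domain)
			break;
	}
	return group;
}

/* Rebuilds the A/B/C tree from the description; returns 0 on success. */
int oom_group_example(void)
{
	struct cg root = { "root", NULL, false };
	struct cg a = { "A", &root, true };
	struct cg b = { "B", &a, false };
	struct cg c = { "C", &a, false };

	/* Victim still in B, OOM in B: kill only the victim. */
	assert(get_oom_group(&b, &b) == NULL);
	/* Victim moved to C, OOM in B: oom.group on A is ignored. */
	assert(get_oom_group(&c, &b) == NULL);
	/* OOM in A itself: the whole group A is selected. */
	assert(get_oom_group(&c, &a) == &a);
	return 0;
}
```

If the descendant check is dropped from this model, the walk from C never
meets B, continues to the root, and returns A for the second case, which
is the unwanted behavior described above.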