References: <1594735034-19190-1-git-send-email-laoar.shao@gmail.com>
From: Yafang Shao
Date: Wed, 15 Jul 2020 09:44:26 +0800
Subject: Re: [PATCH v2] memcg, oom: check memcg margin for parallel oom
To: David Rientjes
Cc: Michal Hocko, Tetsuo Handa, Andrew Morton, Johannes Weiner, Linux MM

On Wed, Jul 15, 2020 at 2:46 AM David Rientjes wrote:
>
> On Tue, 14 Jul 2020, Yafang Shao wrote:
>
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 1962232..15e0e18 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -1560,15 +1560,21 @@ static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
> >  		.gfp_mask = gfp_mask,
> >  		.order = order,
> >  	};
> > -	bool ret;
> > +	bool ret = true;
> >
> >  	if (mutex_lock_killable(&oom_lock))
> >  		return true;
> > +
> > +	if (mem_cgroup_margin(memcg) >= (1 << order))
> > +		goto unlock;
> > +
> >  	/*
> >  	 * A few threads which were not waiting at mutex_lock_killable() can
> >  	 * fail to bail out.  Therefore, check again after holding oom_lock.
> >  	 */
> >  	ret = should_force_charge() || out_of_memory(&oc);
> > +
> > +unlock:
> >  	mutex_unlock(&oom_lock);
> >  	return ret;
> >  }
>
> Hi Yafang,
>
> We've run with a patch very much like this for several years and it works
> quite successfully to prevent the unnecessary oom killing of processes.
>
> We do this in out_of_memory() directly, however, because we found that we
> could prevent even *more* unnecessary killing if we checked this at the
> "point of no return" because the selection of processes takes some
> additional time when we might resolve the oom condition.
>

Hi David,

Your proposal would also resolve the issue, but I'm wondering why it
should be done specifically for memcg OOM. Doesn't the same reasoning
apply to global OOM? For example, during a global OOM, other tasks might
free some pages while the victim is being selected, and the allocation
might then succeed.

> Some may argue that this is unnecessarily exposing mem_cgroup_margin() to
> generic mm code, but in the interest of preventing any unnecessary oom
> kill we've found it to be helpful.
>
> I proposed a variant of this in https://lkml.org/lkml/2020/3/11/1089.
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -798,6 +798,8 @@ static inline void memcg_memory_event_mm(struct mm_struct *mm,
>  void mem_cgroup_split_huge_fixup(struct page *head);
>  #endif
>
> +unsigned long mem_cgroup_margin(struct mem_cgroup *memcg);
> +
>  #else /* CONFIG_MEMCG */
>
>  #define MEM_CGROUP_ID_SHIFT	0
> @@ -825,6 +827,10 @@ static inline void memcg_memory_event_mm(struct mm_struct *mm,
>  {
>  }
>
> +static inline unsigned long mem_cgroup_margin(struct mem_cgroup *memcg)
> +{
> +}
> +
>  static inline unsigned long mem_cgroup_protection(struct mem_cgroup *memcg,
>  						  bool in_low_reclaim)
>  {
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1282,7 +1282,7 @@ void mem_cgroup_update_lru_size(struct lruvec *lruvec, enum lru_list lru,
>   * Returns the maximum amount of memory @mem can be charged with, in
>   * pages.
>   */
> -static unsigned long mem_cgroup_margin(struct mem_cgroup *memcg)
> +unsigned long mem_cgroup_margin(struct mem_cgroup *memcg)
>  {
>  	unsigned long margin = 0;
>  	unsigned long count;
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -1109,9 +1109,23 @@ bool out_of_memory(struct oom_control *oc)
>  		if (!is_sysrq_oom(oc) && !is_memcg_oom(oc))
>  			panic("System is deadlocked on memory\n");
>  	}
> -	if (oc->chosen && oc->chosen != (void *)-1UL)
> +	if (oc->chosen && oc->chosen != (void *)-1UL) {
> +		if (is_memcg_oom(oc)) {
> +			/*
> +			 * If a memcg is now under its limit or current will be
> +			 * exiting and freeing memory, avoid needlessly killing
> +			 * chosen.
> +			 */
> +			if (mem_cgroup_margin(oc->memcg) >= (1 << oc->order) ||
> +			    task_will_free_mem(current)) {
> +				put_task_struct(oc->chosen);
> +				return true;
> +			}
> +		}
> +
>  		oom_kill_process(oc, !is_memcg_oom(oc) ? "Out of memory" :
>  				 "Memory cgroup out of memory");
> +	}
>  	return !!oc->chosen;
>  }
>

--
Thanks
Yafang
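
For readers following the thread, here is a minimal user-space sketch of the
margin test that both patches rely on: skip the kill when the memcg already
has room for 2^order pages. It is only an illustration under stated
assumptions. struct toy_memcg, toy_margin() and toy_kill_needed() are
hypothetical names, not kernel code, and the real mem_cgroup_margin() also
has to account for the memory+swap counter, which this sketch ignores.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for a memcg's page counter; not a kernel type. */
struct toy_memcg {
	unsigned long usage;	/* pages currently charged */
	unsigned long limit;	/* hard limit, in pages */
};

/* Number of pages that can still be charged before the limit is hit. */
static unsigned long toy_margin(const struct toy_memcg *memcg)
{
	if (memcg->usage >= memcg->limit)
		return 0;
	return memcg->limit - memcg->usage;
}

/*
 * Returns true if an allocation of 2^order pages still does not fit,
 * i.e. an OOM kill would still be needed.  Returns false if a parallel
 * kill (or any other uncharge) has already freed enough room.
 */
static bool toy_kill_needed(const struct toy_memcg *memcg, unsigned int order)
{
	/* Guard the shift below against an out-of-range order. */
	assert(order < sizeof(unsigned long) * CHAR_BIT);
	return toy_margin(memcg) < (1UL << order);
}
```

In this model, the first patch corresponds to calling toy_kill_needed() once
after taking the lock, while the variant quoted above calls it again after
victim selection, just before the kill, so that memory freed during selection
is also taken into account.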