Date: Mon, 13 Jul 2020 14:45:03 +0200
From: Michal Hocko
To: Yafang Shao
Cc: David Rientjes, Andrew Morton, Linux MM
Subject: Re: [PATCH] mm, oom: don't invoke oom killer if current has been reapered
Message-ID: <20200713124503.GF16783@dhcp22.suse.cz>
References: <1594437481-11144-1-git-send-email-laoar.shao@gmail.com>
 <20200713060154.GA16783@dhcp22.suse.cz>
 <20200713062132.GB16783@dhcp22.suse.cz>

On Mon 13-07-20 20:24:07, Yafang Shao wrote:
> On Mon, Jul 13, 2020 at 2:21 PM Michal Hocko wrote:
> >
> > On Mon 13-07-20 08:01:57, Michal Hocko wrote:
> > > On Fri 10-07-20 23:18:01, Yafang Shao wrote:
> > [...]
> > > > There are many threads of a multi-threaded task running in parallel
> > > > in a container on many CPUs. Many of them trigger OOM at the same
> > > > time:
> > > >
> > > > CPU-1          CPU-2          ...    CPU-n
> > > > thread-1       thread-2       ...    thread-n
> > > >
> > > > wait oom_lock  wait oom_lock  ...    hold oom_lock
> > > >
> > > >                                      (sigkill received)
> > > >
> > > >                                      select current as victim
> > > >                                      and wakeup oom reaper
> > > >
> > > >                                      release oom_lock
> > > >
> > > > (MMF_OOM_SKIP set by oom reaper)
> > > >
> > > > (lots of pages are freed)
> > > > hold oom_lock
> > >
> > > Could you be more specific please? The page allocator never waits for
> > > the oom_lock and keeps retrying instead. Also __alloc_pages_may_oom
> > > tries to allocate with the lock held.
> >
> > I suspect that you are looking at the memcg oom killer.
>
> Right, these threads were waiting for the oom_lock in
> mem_cgroup_out_of_memory().
>
> > Because we do not do
> > trylock there for some reason I do not immediately remember off the top
> > of my head. If this is really the case then I would recommend looking
> > into how the page allocator implements this and following the same
> > pattern for memcg as well.
>
> That is a good suggestion.
> But we can't simply try locking the global oom_lock here, because a task
> OOMing in memcg foo may not be able to help the tasks in memcg bar.

I do not follow. oom_lock is not about forward progress. It is a big lock
to synchronize against the oom_disable logic.

I have this in mind:

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 248e6cad0095..29d1f8c2d968 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1563,8 +1563,10 @@ static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
 	};
 	bool ret;
 
-	if (mutex_lock_killable(&oom_lock))
+	if (!mutex_trylock(&oom_lock))
 		return true;
+
+	/*
 	 * A few threads which were not waiting at mutex_lock_killable() can
 	 * fail to bail out. Therefore, check again after holding oom_lock.

But as I've said I would need to double check the history on why we
differ here. Btw. I suspect that the mem_cgroup_out_of_memory call in
mem_cgroup_oom_synchronize is bogus and can no longer trigger after
29ef680ae7c21, but this needs double checking as well.

> IOW, we need to introduce a per-memcg oom_lock, like below,

I do not see why. Besides that, we already do have a per-oom-memcg
hierarchy lock.
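For reference, the page allocator pattern I mean looks like the following.
This is a trimmed sketch of __alloc_pages_may_oom() from mm/page_alloc.c,
reproduced from memory rather than copied from a tree, so double check the
details against your kernel:

/*
 * Trimmed sketch of __alloc_pages_may_oom(); illustrative only, the
 * elided parts do the last-ditch allocation and out_of_memory() call.
 */
static inline struct page *
__alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
		      const struct alloc_context *ac,
		      unsigned long *did_some_progress)
{
	struct oom_control oc = {
		.zonelist = ac->zonelist,
		.nodemask = ac->nodemask,
		.memcg = NULL,
		.gfp_mask = gfp_mask,
		.order = order,
	};
	struct page *page = NULL;

	*did_some_progress = 0;

	/*
	 * Acquire the oom lock. If that fails, somebody else is making
	 * progress for us. Claiming progress makes the allocator retry
	 * instead of failing, so nobody ever sleeps on oom_lock here.
	 */
	if (!mutex_trylock(&oom_lock)) {
		*did_some_progress = 1;
		schedule_timeout_uninterruptible(1);
		return NULL;
	}

	/* ... last ditch allocation attempt, then out_of_memory(&oc) ... */

	mutex_unlock(&oom_lock);
	return page;
}

The memcg path could follow the same shape: a failed trylock just means
somebody else is already handling the OOM, and the charge path will retry.

-- 
Michal Hocko
SUSE Labs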