From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753498AbdJMHAF (ORCPT );
	Fri, 13 Oct 2017 03:00:05 -0400
Received: from mx2.suse.de ([195.135.220.15]:52619 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751503AbdJMHAD (ORCPT );
	Fri, 13 Oct 2017 03:00:03 -0400
Date: Fri, 13 Oct 2017 09:00:01 +0200
From: Michal Hocko
To: Johannes Weiner
Cc: Greg Thelen, Shakeel Butt, Alexander Viro, Vladimir Davydov,
	Andrew Morton, Linux MM, linux-fsdevel@vger.kernel.org, LKML
Subject: Re: [PATCH] fs, mm: account filp and names caches to kmemcg
Message-ID: <20171013070001.mglwdzdrqjt47clz@dhcp22.suse.cz>
References: <20171006075900.icqjx5rr7hctn3zd@dhcp22.suse.cz>
	<20171009062426.hmqedtqz5hkmhnff@dhcp22.suse.cz>
	<20171009202613.GA15027@cmpxchg.org>
	<20171010091430.giflzlayvjblx5bu@dhcp22.suse.cz>
	<20171010141733.GB16710@cmpxchg.org>
	<20171010142434.bpiqmsbb7gttrlcb@dhcp22.suse.cz>
	<20171012190312.GA5075@cmpxchg.org>
	<20171013063555.pa7uco43mod7vrkn@dhcp22.suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20171013063555.pa7uco43mod7vrkn@dhcp22.suse.cz>
User-Agent: NeoMutt/20170609 (1.8.3)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Just to be explicit about what I had in mind: this hasn't even been
compile tested, but it should at least give an idea of where I am
trying to go.
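
For anyone who wants the intended control flow without the kernel
context, here is a rough userspace model in C. It is only a sketch under
stated assumptions: the toy_* functions, struct toy_memcg and its fields,
and RECLAIM_RETRIES are made up for illustration and are not kernel
interfaces. Reclaim is not modelled at all, and the OOM killer plus
oom_reaper are reduced to a counter of killable tasks that each free a
fixed number of pages.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for MEM_CGROUP_RECLAIM_RETRIES; the value is arbitrary here. */
#define RECLAIM_RETRIES 5

/* Hypothetical userspace model of a memory cgroup, not the kernel struct. */
struct toy_memcg {
    size_t limit;           /* hard limit, in pages */
    size_t usage;           /* pages currently charged */
    bool oom_kill_disable;  /* v1 style: user space handles the OOM */
    bool may_oom;           /* models current->memcg_may_oom */
    bool deferred_oom;      /* OOM context recorded for later handling */
    size_t victims;         /* killable tasks left in the group */
    size_t victim_pages;    /* pages freed per killed task */
};

/* Models mem_cgroup_out_of_memory(): true if a victim was found and freed. */
static bool toy_out_of_memory(struct toy_memcg *cg)
{
    size_t freed;

    if (cg->victims == 0)
        return false;
    cg->victims--;
    freed = cg->victim_pages < cg->usage ? cg->victim_pages : cg->usage;
    cg->usage -= freed;
    return true;
}

/* Models the patched mem_cgroup_oom(): true means "retry the charge". */
static bool toy_oom(struct toy_memcg *cg)
{
    if (cg->oom_kill_disable) {
        /* Cannot block here; remember the context and bail out. */
        if (cg->may_oom)
            cg->deferred_oom = true;
        return false;
    }
    return toy_out_of_memory(cg);
}

/* Models try_charge(): 0 on success, -1 in place of -ENOMEM. */
static int toy_try_charge(struct toy_memcg *cg, size_t nr_pages)
{
    int nr_retries = RECLAIM_RETRIES;

retry:
    if (cg->usage + nr_pages <= cg->limit) {
        cg->usage += nr_pages;
        return 0;
    }
    if (nr_retries-- > 0)
        goto retry;     /* the real code would attempt reclaim here */
    if (toy_oom(cg)) {
        /* A victim was killed, so start a fresh round of retries. */
        nr_retries = RECLAIM_RETRIES;
        goto retry;
    }
    return -1;
}
```

With limit 10, usage 10 and two victims of 3 pages each,
toy_try_charge(&cg, 4) succeeds once both victims are gone and leaves
usage at 8. With oom_kill_disable set, the same charge fails without
killing anything and only records the deferred OOM.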
---
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index d5f3a62887cf..91fa05372114 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1528,26 +1528,36 @@ static void memcg_oom_recover(struct mem_cgroup *memcg)
 
-static void mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int order)
+static bool mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int order)
 {
-	if (!current->memcg_may_oom)
-		return;
 	/*
 	 * We are in the middle of the charge context here, so we
 	 * don't want to block when potentially sitting on a callstack
 	 * that holds all kinds of filesystem and mm locks.
 	 *
-	 * Also, the caller may handle a failed allocation gracefully
-	 * (like optional page cache readahead) and so an OOM killer
-	 * invocation might not even be necessary.
+	 * cgroup v1 allows sync user space handling so we cannot afford
+	 * to get stuck here for that configuration. That's why we don't do
+	 * anything here except remember the OOM context and then deal with
+	 * it at the end of the page fault when the stack is unwound, the
+	 * locks are released, and when we know whether the fault was overall
+	 * successful.
 	 *
-	 * That's why we don't do anything here except remember the
-	 * OOM context and then deal with it at the end of the page
-	 * fault when the stack is unwound, the locks are released,
-	 * and when we know whether the fault was overall successful.
+	 * On the other hand, the in-kernel OOM killer allows for an async
+	 * victim memory reclaim (oom_reaper) and that means that we are not
+	 * solely relying on the oom victim to make forward progress so we
+	 * can stay in the try_charge context and keep retrying as long as
+	 * there are oom victims to select.
 	 */
-	css_get(&memcg->css);
-	current->memcg_in_oom = memcg;
-	current->memcg_oom_gfp_mask = mask;
-	current->memcg_oom_order = order;
+	if (memcg->oom_kill_disable) {
+		if (!current->memcg_may_oom)
+			return false;
+		css_get(&memcg->css);
+		current->memcg_in_oom = memcg;
+		current->memcg_oom_gfp_mask = mask;
+		current->memcg_oom_order = order;
+
+		return false;
+	}
+
+	return mem_cgroup_out_of_memory(memcg, mask, order);
 }
 
 /**
@@ -2007,8 +2017,11 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
 
 	mem_cgroup_event(mem_over_limit, MEMCG_OOM);
 
-	mem_cgroup_oom(mem_over_limit, gfp_mask,
-		       get_order(nr_pages * PAGE_SIZE));
+	if (mem_cgroup_oom(mem_over_limit, gfp_mask,
+			   get_order(nr_pages * PAGE_SIZE))) {
+		nr_retries = MEM_CGROUP_RECLAIM_RETRIES;
+		goto retry;
+	}
 nomem:
 	if (!(gfp_mask & __GFP_NOFAIL))
 		return -ENOMEM;
-- 
Michal Hocko
SUSE Labs