From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752045AbdJXMTG (ORCPT );
	Tue, 24 Oct 2017 08:19:06 -0400
Received: from mx2.suse.de ([195.135.220.15]:47280 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751958AbdJXMTC (ORCPT );
	Tue, 24 Oct 2017 08:19:02 -0400
Date: Tue, 24 Oct 2017 14:18:59 +0200
From: Michal Hocko
To: Greg Thelen , Johannes Weiner
Cc: Shakeel Butt , Alexander Viro , Vladimir Davydov ,
	Andrew Morton , Linux MM ,
	linux-fsdevel@vger.kernel.org, LKML
Subject: Re: [PATCH] fs, mm: account filp and names caches to kmemcg
Message-ID: <20171024121859.4zd3zaafnjnlem4i@dhcp22.suse.cz>
References: <20171009062426.hmqedtqz5hkmhnff@dhcp22.suse.cz>
	<20171009202613.GA15027@cmpxchg.org>
	<20171010091430.giflzlayvjblx5bu@dhcp22.suse.cz>
	<20171010141733.GB16710@cmpxchg.org>
	<20171010142434.bpiqmsbb7gttrlcb@dhcp22.suse.cz>
	<20171012190312.GA5075@cmpxchg.org>
	<20171013063555.pa7uco43mod7vrkn@dhcp22.suse.cz>
	<20171013070001.mglwdzdrqjt47clz@dhcp22.suse.cz>
	<20171013152421.yf76n7jui3z5bbn4@dhcp22.suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20171013152421.yf76n7jui3z5bbn4@dhcp22.suse.cz>
User-Agent: NeoMutt/20170609 (1.8.3)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: 
X-Mailing-List: linux-kernel@vger.kernel.org

Does this sound like something you would be interested in? I can spend
some more time on it if it is worthwhile.

On Fri 13-10-17 17:24:21, Michal Hocko wrote:
> Well, it actually occurred to me that this would trigger the global oom
> killer in case no memcg-specific victim can be found, which is definitely
> not something we would like to do. This should work better. I am not
> sure we can trigger this corner case, but we should cover it and it
> actually doesn't make the code much worse.
> ---
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index d5f3a62887cf..7b370f070b82 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1528,26 +1528,40 @@ static void memcg_oom_recover(struct mem_cgroup *memcg)
>  
> -static void mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int order)
> +static bool mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int order)
>  {
> -	if (!current->memcg_may_oom)
> -		return;
>  	/*
>  	 * We are in the middle of the charge context here, so we
>  	 * don't want to block when potentially sitting on a callstack
>  	 * that holds all kinds of filesystem and mm locks.
>  	 *
> -	 * Also, the caller may handle a failed allocation gracefully
> -	 * (like optional page cache readahead) and so an OOM killer
> -	 * invocation might not even be necessary.
> +	 * cgroup v1 allows sync user space handling so we cannot afford
> +	 * to get stuck here for that configuration. That's why we don't do
> +	 * anything here except remember the OOM context and then deal with
> +	 * it at the end of the page fault when the stack is unwound, the
> +	 * locks are released, and when we know whether the fault was overall
> +	 * successful.
> +	 *
> +	 * On the other hand, the in-kernel OOM killer allows for async victim
> +	 * memory reclaim (oom_reaper) and that means that we are not solely
> +	 * relying on the oom victim to make forward progress so we can stay
> +	 * in the try_charge context and keep retrying as long as there
> +	 * are oom victims to select.
>  	 *
> -	 * That's why we don't do anything here except remember the
> -	 * OOM context and then deal with it at the end of the page
> -	 * fault when the stack is unwound, the locks are released,
> -	 * and when we know whether the fault was overall successful.
> +	 * Please note that mem_cgroup_out_of_memory might fail to find a
> +	 * victim and then we have to rely on mem_cgroup_oom_synchronize,
> +	 * otherwise we would fall back to the global oom killer in
> +	 * pagefault_out_of_memory.
>  	 */
> +	if (!memcg->oom_kill_disable &&
> +			mem_cgroup_out_of_memory(memcg, mask, order))
> +		return true;
> +
> +	if (!current->memcg_may_oom)
> +		return false;
>  	css_get(&memcg->css);
>  	current->memcg_in_oom = memcg;
>  	current->memcg_oom_gfp_mask = mask;
>  	current->memcg_oom_order = order;
> +
> +	return false;
>  }
>  
>  /**
> @@ -2007,8 +2021,11 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
>  
>  	mem_cgroup_event(mem_over_limit, MEMCG_OOM);
>  
> -	mem_cgroup_oom(mem_over_limit, gfp_mask,
> -		       get_order(nr_pages * PAGE_SIZE));
> +	if (mem_cgroup_oom(mem_over_limit, gfp_mask,
> +				get_order(nr_pages * PAGE_SIZE))) {
> +		nr_retries = MEM_CGROUP_RECLAIM_RETRIES;
> +		goto retry;
> +	}
>  nomem:
>  	if (!(gfp_mask & __GFP_NOFAIL))
>  		return -ENOMEM;
> -- 
> Michal Hocko
> SUSE Labs

-- 
Michal Hocko
SUSE Labs