From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S1752045AbdJXS7G (ORCPT ); Tue, 24 Oct 2017 14:59:06 -0400
Received: from gum.cmpxchg.org ([85.214.110.215]:44948 "EHLO gum.cmpxchg.org"
    rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
    id S1751591AbdJXS7F (ORCPT ); Tue, 24 Oct 2017 14:59:05 -0400
Date: Tue, 24 Oct 2017 14:58:54 -0400
From: Johannes Weiner
To: Michal Hocko
Cc: Greg Thelen, Shakeel Butt, Alexander Viro, Vladimir Davydov,
    Andrew Morton, Linux MM, linux-fsdevel@vger.kernel.org, LKML
Subject: Re: [PATCH] fs, mm: account filp and names caches to kmemcg
Message-ID: <20171024185854.GA6154@cmpxchg.org>
References: <20171010141733.GB16710@cmpxchg.org>
    <20171010142434.bpiqmsbb7gttrlcb@dhcp22.suse.cz>
    <20171012190312.GA5075@cmpxchg.org>
    <20171013063555.pa7uco43mod7vrkn@dhcp22.suse.cz>
    <20171013070001.mglwdzdrqjt47clz@dhcp22.suse.cz>
    <20171013152421.yf76n7jui3z5bbn4@dhcp22.suse.cz>
    <20171024160637.GB32340@cmpxchg.org>
    <20171024162213.n6jrpz3t5pldkgxy@dhcp22.suse.cz>
    <20171024172330.GA3973@cmpxchg.org>
    <20171024175558.uxqtxwhjgu6ceadk@dhcp22.suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20171024175558.uxqtxwhjgu6ceadk@dhcp22.suse.cz>
User-Agent: Mutt/1.9.1 (2017-09-22)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Oct 24, 2017 at 07:55:58PM +0200, Michal Hocko wrote:
> On Tue 24-10-17 13:23:30, Johannes Weiner wrote:
> > On Tue, Oct 24, 2017 at 06:22:13PM +0200, Michal Hocko wrote:
> [...]
> > > What would prevent a runaway in case the only process in the memcg is
> > > oom unkillable then?
> >
> > In such a scenario, the page fault handler would busy-loop right now.
> >
> > Disabling oom kills is a privileged operation with dire consequences
> > if used incorrectly. You can panic the kernel with it. Why should the
> > cgroup OOM killer implement protective semantics around this setting?
> > Breaching the limit in such a setup is entirely acceptable.
> >
> > Really, I think it's an enormous mistake to start modeling semantics
> > based on the most contrived and non-sensical edge case configurations.
> > Start the discussion with what is sane and what most users should
> > optimally experience, and keep the cornercases simple.
>
> I am not really seeing your concern about the semantic. The most
> important property of the hard limit is to protect from runaways and
> stop them if they happen. Users can use the softer variant (high limit)
> if they are not afraid of those scenarios. It is not so insane to
> imagine that a master task (which I can easily imagine would be oom
> disabled) has a leak and runaway as a result.

Then you're screwed either way. Where do you return -ENOMEM in a page
fault path that cannot OOM kill anything? Your choice is between
maintaining the hard limit semantics or going into an infinite loop.

I fail to see how this setup has any impact on the semantics we pick
here. And even if it were real, it's really not what most users do.

> We are not talking only about the page fault path. There are other
> allocation paths to consume a lot of memory and spill over and break
> the isolation restriction. So it makes much more sense to me to fail
> the allocation in such a situation rather than allow the runaway to
> continue. Just consider that such a situation shouldn't happen in
> the first place because there should always be an eligible task to
> kill - who would own all the memory otherwise?

Okay, then let's just stick to the current behavior.
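
For readers following the thread, the three outcomes being weighed above
(fail the allocation with -ENOMEM, breach the hard limit, or keep
retrying) can be laid out in a toy userspace model. This is only a sketch
and not kernel code: `toy_memcg`, `toy_policy`, and `toy_charge` are
hypothetical names made up for illustration, the "OOM kill" is modeled as
simply releasing one victim's pages, and -EAGAIN stands in for "the
caller would retry", which in a page fault path means looping.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical policies for a charge that would exceed the hard limit
 * when nothing in the group can be OOM killed. */
enum toy_policy {
    TOY_FAIL_ENOMEM,  /* fail the allocation, hard limit holds */
    TOY_BREACH_LIMIT, /* let the charge through, limit is exceeded */
    TOY_RETRY,        /* tell the caller to retry (busy-loop on fault) */
};

/* Toy stand-in for a memory cgroup; all fields are illustrative. */
struct toy_memcg {
    size_t usage;           /* pages currently charged */
    size_t limit;           /* hard limit in pages */
    bool has_killable_task; /* false if every task is OOM-disabled */
    size_t killable_usage;  /* pages released if the victim is killed */
};

/* Try to charge `pages` to `cg`.
 * Returns 0 on success, -ENOMEM if the allocation is failed,
 * -EAGAIN if the caller is expected to retry. */
int toy_charge(struct toy_memcg *cg, size_t pages, enum toy_policy policy)
{
    assert(cg->killable_usage <= cg->usage);

    /* Fast path: the charge fits under the limit. */
    if (cg->usage + pages <= cg->limit) {
        cg->usage += pages;
        return 0;
    }

    /* Model an OOM kill: the victim's memory is released. */
    if (cg->has_killable_task) {
        cg->usage -= cg->killable_usage;
        cg->killable_usage = 0;
        cg->has_killable_task = false;
        if (cg->usage + pages <= cg->limit) {
            cg->usage += pages;
            return 0;
        }
    }

    /* Nothing left to kill: this is the corner case under discussion. */
    switch (policy) {
    case TOY_BREACH_LIMIT:
        cg->usage += pages;
        return 0;
    case TOY_RETRY:
        return -EAGAIN;
    case TOY_FAIL_ENOMEM:
    default:
        return -ENOMEM;
    }
}
```

In this model the disagreement only shows up once `has_killable_task` is
false: as long as a victim exists, all three policies behave the same,
which matches the observation above that there should normally be an
eligible task to kill.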