From: Johannes Weiner <hannes@cmpxchg.org>
To: Michal Hocko <mhocko@kernel.org>
Cc: Greg Thelen <gthelen@google.com>,
    Shakeel Butt <shakeelb@google.com>,
    Alexander Viro <viro@zeniv.linux.org.uk>,
    Vladimir Davydov <vdavydov.dev@gmail.com>,
    Andrew Morton <akpm@linux-foundation.org>,
    Linux MM <linux-mm@kvack.org>,
    linux-fsdevel@vger.kernel.org,
    LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] fs, mm: account filp and names caches to kmemcg
Date: Tue, 24 Oct 2017 13:23:30 -0400
Message-ID: <20171024172330.GA3973@cmpxchg.org>
In-Reply-To: <20171024162213.n6jrpz3t5pldkgxy@dhcp22.suse.cz>

On Tue, Oct 24, 2017 at 06:22:13PM +0200, Michal Hocko wrote:
> On Tue 24-10-17 12:06:37, Johannes Weiner wrote:
> > >   *
> > > - * That's why we don't do anything here except remember the
> > > - * OOM context and then deal with it at the end of the page
> > > - * fault when the stack is unwound, the locks are released,
> > > - * and when we know whether the fault was overall successful.
> > > + * Please note that mem_cgroup_oom_synchronize might fail to find a
> > > + * victim and then we have rely on mem_cgroup_oom_synchronize otherwise
> > > + * we would fall back to the global oom killer in pagefault_out_of_memory
> >
> > Ah, that's why... Ugh, that's really duct-tapey.
>
> As you know, I really hate the #PF OOM path. We should get rid of it.

I agree, but this isn't getting rid of it, it just adds more layers.

> > > @@ -2007,8 +2021,11 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
> > >
> > >      mem_cgroup_event(mem_over_limit, MEMCG_OOM);
> > >
> > > -    mem_cgroup_oom(mem_over_limit, gfp_mask,
> > > -                   get_order(nr_pages * PAGE_SIZE));
> > > +    if (mem_cgroup_oom(mem_over_limit, gfp_mask,
> > > +                       get_order(nr_pages * PAGE_SIZE))) {
> > > +        nr_retries = MEM_CGROUP_RECLAIM_RETRIES;
> > > +        goto retry;
> > > +    }
> >
> > As per the previous email, this has to goto force, otherwise we return
> > -ENOMEM from syscalls once in a blue moon, which makes verification an
> > absolute nightmare. The behavior should be reliable, without weird p99
> > corner cases.
> >
> > I think what we should be doing here is: if a charge fails, set up an
> > oom context and force the charge; add mem_cgroup_oom_synchronize() to
> > the end of syscalls and kernel-context faults.
>
> What would prevent a runaway in case the only process in the memcg is
> oom unkillable then?

In such a scenario, the page fault handler would busy-loop right now.

Disabling oom kills is a privileged operation with dire consequences
if used incorrectly. You can panic the kernel with it. Why should the
cgroup OOM killer implement protective semantics around this setting?
Breaching the limit in such a setup is entirely acceptable.

Really, I think it's an enormous mistake to start modeling semantics
based on the most contrived and non-sensical edge case configurations.
Start the discussion with what is sane and what most users should
optimally experience, and keep the cornercases simple.
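
To illustrate the flow proposed above (on a failed charge, set up an OOM
context and force the charge, then resolve it at the end of the syscall or
kernel-context fault), here is a minimal userspace sketch in C. It is not
kernel code and not the actual patch. Every name in it (model_memcg,
model_task, model_try_charge, model_oom_synchronize, victim_pages) is
hypothetical and exists only for this illustration; the real try_charge()
and mem_cgroup_oom_synchronize() are considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of a memory cgroup; not the kernel's struct. */
struct model_memcg {
    size_t usage;   /* pages currently charged */
    size_t limit;   /* configured limit, in pages */
};

/* Hypothetical per-task state: the pending OOM context, if any. */
struct model_task {
    struct model_memcg *oom_memcg;  /* non-NULL while an OOM is pending */
};

/*
 * Charge nr_pages to memcg. If that would exceed the limit, remember the
 * OOM context on the task and force the charge anyway, so the caller
 * never sees -ENOMEM. Always returns 0 in this model.
 */
static int model_try_charge(struct model_task *task,
                            struct model_memcg *memcg, size_t nr_pages)
{
    assert(nr_pages > 0);

    if (memcg->usage + nr_pages > memcg->limit)
        task->oom_memcg = memcg;    /* deal with it later */

    memcg->usage += nr_pages;       /* forced: limit may be breached */
    return 0;
}

/*
 * Meant to run at the end of a syscall or kernel-context fault, when the
 * stack is unwound and no locks are held. victim_pages models how many
 * pages killing a victim would free; 0 models "no killable victim"
 * (for example, OOM kills disabled), in which case the limit simply
 * stays breached. Returns true if an OOM context was pending.
 */
static bool model_oom_synchronize(struct model_task *task,
                                  size_t victim_pages)
{
    struct model_memcg *memcg = task->oom_memcg;

    if (memcg == NULL)
        return false;

    task->oom_memcg = NULL;
    if (memcg->usage > memcg->limit) {
        size_t freed = victim_pages < memcg->usage ? victim_pages
                                                   : memcg->usage;
        memcg->usage -= freed;
    }
    return true;
}
```

In this model a charge that crosses the limit always succeeds and only
marks the task. The cost is a temporary overshoot of the limit, which is
the trade-off argued for above: reliable behavior for callers, with the
OOM-disabled case reduced to "the limit stays breached".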