From: Shakeel Butt <shakeelb@google.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Vladimir Davydov <vdavydov.dev@gmail.com>,
LKML <linux-kernel@vger.kernel.org>,
Linux MM <linux-mm@kvack.org>,
Andrey Ryabinin <aryabinin@virtuozzo.com>,
Thomas Lindroth <thomas.lindroth@gmail.com>,
Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Subject: Re: [PATCH] memcg, kmem: do not fail __GFP_NOFAIL charges
Date: Thu, 12 Sep 2019 19:46:04 -0700
Message-ID: <CALvZod65jCCH+fHqAQwk0RTZhyhxG71F-sHE7qxrmZ_L1tDbvw@mail.gmail.com>
In-Reply-To: <20190911151612.GI4023@dhcp22.suse.cz>

On Wed, Sep 11, 2019 at 8:16 AM Michal Hocko <mhocko@kernel.org> wrote:
>
> On Wed 11-09-19 07:37:40, Andrew Morton wrote:
> > On Wed, 11 Sep 2019 14:00:02 +0200 Michal Hocko <mhocko@kernel.org> wrote:
> >
> > > On Mon 09-09-19 13:22:45, Michal Hocko wrote:
> > > > On Fri 06-09-19 11:24:55, Shakeel Butt wrote:
> > > [...]
> > > > > I wonder what has changed since
> > > > > <http://lkml.kernel.org/r/20180525185501.82098-1-shakeelb@google.com/>.
> > > >
> > > > > I have completely forgotten about that one. It seems that we have just
> > > > repeated the same discussion again. This time we have a poor user who
> > > > actually enabled the kmem limit.
> > > >
> > > > I guess there was no real objection to the change back then. The primary
> > > > discussion revolved around the fact that the accounting will stay broken
> > > > > even when this particular part was fixed. Considering this leads to an
> > > > > easy-to-trigger crash (with the limit enabled), I guess we should just
> > > > > make it less broken, backport to stable trees, and have a serious
> > > > > discussion about discontinuing the limit. Start by simply failing to
> > > > set any limit in the current upstream kernels.
> > >
> > > > Any more concerns/objections to the patch? I can add a reference to your
> > > > earlier post, Shakeel, if you want, or credit you the way you prefer.
> > >
> > > > Also, are there any objections to starting the deprecation process for
> > > > the kmem limit? I would see it in three stages
> > > - 1st warn in the kernel log
> > > > pr_warn("kmem.limit_in_bytes is deprecated and will be removed. "
> > > > "Please report your usecase to linux-mm@kvack.org if you "
> > > > "depend on this functionality.\n");
> >
> > pr_warn_once() :)
> >
> > > - 2nd fail any write to kmem.limit_in_bytes
> > > - 3rd remove the control file completely
> >
> > Sounds good to me.
>
> Here we go
>
> From 512822e551fe2960040c23b12c7b27a5fdab9013 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.com>
> Date: Wed, 11 Sep 2019 17:02:33 +0200
> Subject: [PATCH] memcg, kmem: deprecate kmem.limit_in_bytes
>
> Cgroup v1 memcg controller has exposed a dedicated kmem limit to users
> which turned out to be a really bad idea because there are paths which
> cannot shrink the kernel memory usage enough to get below the limit
> (e.g. because the accounted memory is not reclaimable). There are cases
> when the failure is not even allowed (e.g. __GFP_NOFAIL). This means
> that the kmem limit is in excess to the hard limit without any way to
> shrink and thus completely useless. OOM killer cannot be invoked to
> handle the situation because that would lead to a premature oom killing.
>
> As a result, many places might see ENOMEM returned from kmalloc, which
> results in unexpected errors. E.g. the global OOM killer can be invoked
> when there is a lot of free memory, because ENOMEM is translated into
> VM_FAULT_OOM in the #PF path and pagefault_out_of_memory then invokes it.
>
> Please note that the kernel memory is still accounted to the overall
> limit along with the user memory, so removing the kmem specific limit
> should still allow kernel memory consumption to be contained. Unlike the kmem
> one, though, it invokes memory reclaim and targeted memcg oom killing if
> necessary.
>
> Start the deprecation process by crying to the kernel log. Let's see
> whether there are relevant usecases and simply return EINVAL in the
> second stage if nobody complains in a few releases.
>
> Signed-off-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
> ---
> Documentation/admin-guide/cgroup-v1/memory.rst | 3 +++
> mm/memcontrol.c | 3 +++
> 2 files changed, 6 insertions(+)
>
> diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst
> index 41bdc038dad9..e53fc2f31549 100644
> --- a/Documentation/admin-guide/cgroup-v1/memory.rst
> +++ b/Documentation/admin-guide/cgroup-v1/memory.rst
> @@ -87,6 +87,9 @@ Brief summary of control files.
> node
>
> memory.kmem.limit_in_bytes set/show hard limit for kernel memory
> + This knob is deprecated and shouldn't be
> + used. It is planned to be removed in
> + a foreseeable future.
> memory.kmem.usage_in_bytes show current kernel memory allocation
> memory.kmem.failcnt show the number of kernel memory usage
> hits limits
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index e18108b2b786..113969bc57e8 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -3518,6 +3518,9 @@ static ssize_t mem_cgroup_write(struct kernfs_open_file *of,
> ret = mem_cgroup_resize_max(memcg, nr_pages, true);
> break;
> case _KMEM:
> + pr_warn_once("kmem.limit_in_bytes is deprecated and will be removed. "
> + "Please report your usecase to linux-mm@kvack.org if you "
> + "depend on this functionality.\n");
> ret = memcg_update_kmem_max(memcg, nr_pages);
> break;
> case _TCP:
> --
> 2.20.1
>
>
> --
> Michal Hocko
> SUSE Labs
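
For readers following the staging discussed above (warn once, then fail
writes, then remove the file), here is a minimal user-space sketch of the
first two stages. It is only an illustration, not the kernel code: the names
toy_memcg, toy_kmem_limit_write, warn_once_kmem, warnings_logged and the
deprecation_stage enum are all hypothetical, and a plain static flag stands
in for what pr_warn_once() does in the kernel. The third stage (removing the
control file) has nothing left to model, so it is omitted.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; none of these exist in the kernel. */
enum deprecation_stage {
	STAGE_WARN,	/* stage 1: warn once, still apply the limit */
	STAGE_REJECT,	/* stage 2: fail any write to the knob */
};

struct toy_memcg {
	unsigned long kmem_max;	/* toy kmem limit, in pages */
};

/* Counts how many times the warning was actually printed. */
static int warnings_logged;

/* Rough analogue of pr_warn_once(): print on the first call only. */
static void warn_once_kmem(void)
{
	static bool warned;

	if (warned)
		return;
	warned = true;
	warnings_logged++;
	fputs("kmem.limit_in_bytes is deprecated and will be removed. "
	      "Please report your usecase to linux-mm@kvack.org if you "
	      "depend on this functionality.\n", stderr);
}

/* Toy version of a write to the kmem limit file under a given stage. */
static int toy_kmem_limit_write(struct toy_memcg *memcg,
				unsigned long nr_pages,
				enum deprecation_stage stage)
{
	if (stage == STAGE_REJECT)
		return -EINVAL;	/* stage 2: limit left untouched */

	warn_once_kmem();		/* stage 1: complain, then carry on */
	memcg->kmem_max = nr_pages;
	return 0;
}
```

In this sketch, repeated calls with STAGE_WARN keep updating kmem_max but
print the message only once, while a call with STAGE_REJECT returns -EINVAL
and leaves the previously set value in place.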