All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Shakeel Butt <shakeelb@google.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux MM <linux-mm@kvack.org>,
	Andrey Ryabinin <aryabinin@virtuozzo.com>,
	Thomas Lindroth <thomas.lindroth@gmail.com>,
	Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Subject: Re: [PATCH] memcg, kmem: do not fail __GFP_NOFAIL charges
Date: Wed, 11 Sep 2019 17:16:12 +0200	[thread overview]
Message-ID: <20190911151612.GI4023@dhcp22.suse.cz> (raw)
In-Reply-To: <20190911073740.b5c40cd47ea845884e25e265@linux-foundation.org>

On Wed 11-09-19 07:37:40, Andrew Morton wrote:
> On Wed, 11 Sep 2019 14:00:02 +0200 Michal Hocko <mhocko@kernel.org> wrote:
> 
> > On Mon 09-09-19 13:22:45, Michal Hocko wrote:
> > > On Fri 06-09-19 11:24:55, Shakeel Butt wrote:
> > [...]
> > > > I wonder what has changed since
> > > > <http://lkml.kernel.org/r/20180525185501.82098-1-shakeelb@google.com/>.
> > > 
> > > I have completely forgot about that one. It seems that we have just
> > > repeated the same discussion again. This time we have a poor user who
> > > actually enabled the kmem limit.
> > > 
> > > I guess there was no real objection to the change back then. The primary
> > > discussion revolved around the fact that the accounting will stay broken
> > > even when this particular part was fixed. Considering this leads to easy
> > > to trigger crash (with the limit enabled) then I guess we should just
> > > make it less broken and backport to stable trees and have a serious
> > > discussion about discontinuing of the limit. Start by simply failing to
> > > set any limit in the current upstream kernels.
> > 
> > Any more concerns/objections to the patch? I can add a reference to your
> > earlier post Shakeel if you want or to credit you the way you prefer.
> > 
> > Also are there any objections to start deprecating process of kmem
> > limit? I would see it in two stages
> > - 1st warn in the kernel log
> > 	pr_warn("kmem.limit_in_bytes is deprecated and will be removed.
> > 	        "Please report your usecase to linux-mm@kvack.org if you "
> > 		"depend on this functionality."
> 
> pr_warn_once() :)
> 
> > - 2nd fail any write to kmem.limit_in_bytes
> > - 3rd remove the control file completely
> 
> Sounds good to me.

Here we go

From 512822e551fe2960040c23b12c7b27a5fdab9013 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.com>
Date: Wed, 11 Sep 2019 17:02:33 +0200
Subject: [PATCH] memcg, kmem: deprecate kmem.limit_in_bytes

Cgroup v1 memcg controller has exposed a dedicated kmem limit to users
which turned out to be really a bad idea because there are paths which
cannot shrink the kernel memory usage enough to get below the limit
(e.g. because the accounted memory is not reclaimable). There are cases
when the failure is even not allowed (e.g. __GFP_NOFAIL). This means
that the kmem limit is in excess to the hard limit without any way to
shrink and thus completely useless. OOM killer cannot be invoked to
handle the situation because that would lead to a premature oom killing.

As a result many places might see ENOMEM returning from kmalloc and
result in unexpected errors. E.g. a global OOM killer when there is a
lot of free memory because ENOMEM is translated into VM_FAULT_OOM in #PF
path and therefore pagefault_out_of_memory would result in OOM killer.

Please note that the kernel memory is still accounted to the overall
limit along with the user memory so removing the kmem specific limit
should still allow to contain kernel memory consumption. Unlike the kmem
one, though, it invokes memory reclaim and targeted memcg oom killing if
necessary.

Start the deprecation process by crying to the kernel log. Let's see
whether there are relevant usecases and simply return to EINVAL in the
second stage if nobody complains in few releases.

Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 Documentation/admin-guide/cgroup-v1/memory.rst | 3 +++
 mm/memcontrol.c                                | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst
index 41bdc038dad9..e53fc2f31549 100644
--- a/Documentation/admin-guide/cgroup-v1/memory.rst
+++ b/Documentation/admin-guide/cgroup-v1/memory.rst
@@ -87,6 +87,9 @@ Brief summary of control files.
 				     node
 
  memory.kmem.limit_in_bytes          set/show hard limit for kernel memory
+                                     This knob is deprecated it shouldn't be
+                                     used. It is planned to be removed in
+                                     a foreseeable future.
  memory.kmem.usage_in_bytes          show current kernel memory allocation
  memory.kmem.failcnt                 show the number of kernel memory usage
 				     hits limits
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index e18108b2b786..113969bc57e8 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3518,6 +3518,9 @@ static ssize_t mem_cgroup_write(struct kernfs_open_file *of,
 			ret = mem_cgroup_resize_max(memcg, nr_pages, true);
 			break;
 		case _KMEM:
+			pr_warn_once("kmem.limit_in_bytes is deprecated and will be removed. "
+				     "Please report your usecase to linux-mm@kvack.org if you "
+				     "depend on this functionality.\n");
 			ret = memcg_update_kmem_max(memcg, nr_pages);
 			break;
 		case _TCP:
-- 
2.20.1


-- 
Michal Hocko
SUSE Labs

  reply	other threads:[~2019-09-11 15:16 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-01 20:43 [BUG] Early OOM and kernel NULL pointer dereference in 4.19.69 Thomas Lindroth
2019-09-02  7:16 ` Michal Hocko
2019-09-02  7:27   ` Michal Hocko
2019-09-02 19:34   ` Thomas Lindroth
2019-09-03  7:41     ` Michal Hocko
2019-09-03 12:01       ` Thomas Lindroth
2019-09-03 12:05       ` Andrey Ryabinin
2019-09-03 12:22         ` Michal Hocko
2019-09-03 18:20           ` Thomas Lindroth
2019-09-03 19:36             ` Michal Hocko
     [not found] ` <666dbcde-1b8a-9e2d-7d1f-48a117c78ae1@I-love.SAKURA.ne.jp>
2019-09-03 18:25   ` Thomas Lindroth
     [not found]     ` <4d0eda9a-319d-1a7d-1eed-71da90902367@i-love.sakura.ne.jp>
2019-09-04 11:25       ` [BUG] kmemcg limit defeats __GFP_NOFAIL allocation Michal Hocko
     [not found]         ` <4d87d770-c110-224f-6c0c-d6fada90417d@i-love.sakura.ne.jp>
2019-09-04 11:59           ` Michal Hocko
     [not found]         ` <0056063b-46ff-0ebd-ff0d-c96a1f9ae6b1@i-love.sakura.ne.jp>
2019-09-04 14:29           ` Michal Hocko
     [not found]             ` <405ce28b-c0b4-780c-c883-42d741ec60e0@i-love.sakura.ne.jp>
2019-09-05 23:11               ` Thomas Lindroth
2019-09-06  7:27                 ` Michal Hocko
2019-09-06 10:54                   ` Andrey Ryabinin
2019-09-06 11:29                     ` Michal Hocko
2019-09-06 12:56 ` [PATCH] memcg, kmem: do not fail __GFP_NOFAIL charges Michal Hocko
2019-09-06 18:24   ` Shakeel Butt
2019-09-06 18:24     ` Shakeel Butt
2019-09-09 11:22     ` Michal Hocko
2019-09-11 12:00       ` Michal Hocko
2019-09-11 14:37         ` Andrew Morton
2019-09-11 15:16           ` Michal Hocko [this message]
2019-09-13  2:46             ` Shakeel Butt
2019-09-13  2:46               ` Shakeel Butt
2019-09-24 10:53   ` Michal Hocko
2019-09-24 23:06     ` Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190911151612.GI4023@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=aryabinin@virtuozzo.com \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=penguin-kernel@i-love.sakura.ne.jp \
    --cc=shakeelb@google.com \
    --cc=thomas.lindroth@gmail.com \
    --cc=vdavydov.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.