From: Vasily Averin <vvs@virtuozzo.com>
To: Michal Hocko <mhocko@suse.com>, cgroups@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, "Roman Gushchin" <guro@fb.com>,
	"Christian Brauner" <christian.brauner@ubuntu.com>,
	"Michal Koutný" <mkoutny@suse.com>,
	"Serge Hallyn" <serge@hallyn.com>
Subject: [PATCH v2 1/1] memcg: enable accounting for pids in nested pid namespaces
Date: Sat, 24 Apr 2021 14:54:35 +0300
Message-ID: <8b6de616-fd1a-02c6-cbdb-976ecdcfa604@virtuozzo.com>
In-Reply-To: <7b777e22-5b0d-7444-343d-92cbfae5f8b4@virtuozzo.com>

Commit 5d097056c9a0 ("kmemcg: account certain kmem allocations to memcg")
enabled memcg accounting for pids allocated from init_pid_ns.pid_cachep,
but forgot to adjust the setting for nested pid namespaces.
As a result, pid memory is not accounted exactly where it is really needed,
inside memcg-limited containers with their own pid namespaces.

Pid was one of the first kernel objects enabled for memcg accounting.
init_pid_ns.pid_cachep is marked with SLAB_ACCOUNT, so one would expect
any new pid in the system to be memcg-accounted.

However, I recently noticed that this is not the case. Nested pid namespaces
create their own slab caches for pid objects, because nested pids are larger:
they hold an id for every parent pid namespace as well as for their own.
The problem is that these slab caches are _NOT_ marked with SLAB_ACCOUNT,
so pids allocated in nested pid namespaces are not memcg-accounted.

A pid struct in a nested pid namespace consumes up to 500 bytes of memory,
so 100000 such objects amount to up to ~50 MB of unaccounted memory,
which allows a container to exceed its assigned memcg limit.
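
The size argument above can be sketched in userspace C. This is only a
model under stated assumptions: struct upid_model, struct pid_model,
MODEL_MAX_LEVEL and both helper functions are hypothetical names invented
for illustration, and the field layout is a stand-in rather than the
kernel's actual struct pid. It shows that each extra nesting level adds one
per-namespace id entry to the object, and it reproduces the 500 bytes times
100000 objects arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins only; NOT the kernel's definitions. */
struct upid_model {
	int nr;      /* pid value as seen in one namespace */
	void *ns;    /* namespace that value belongs to */
};

struct pid_model {
	unsigned int count;
	unsigned int level;              /* nesting depth of the namespace */
	void *bookkeeping[4];            /* placeholder for other fields */
	struct upid_model numbers[1];    /* one entry per namespace level */
};

/* Hypothetical cap on nesting depth for this model. */
#define MODEL_MAX_LEVEL 32

/* Object size grows by one upid_model entry per nesting level. */
static size_t pid_model_size(unsigned int level)
{
	assert(level <= MODEL_MAX_LEVEL);
	return sizeof(struct pid_model) + level * sizeof(struct upid_model);
}

/* Total memory that escapes accounting for nr_objects such pids. */
static size_t unaccounted_bytes(size_t object_size, size_t nr_objects)
{
	return object_size * nr_objects;
}
```

With the figures quoted above, unaccounted_bytes(500, 100000) evaluates to
50000000 bytes, i.e. roughly 50 MB. The absolute sizes returned by
pid_model_size() depend on the made-up layout and should not be read as
the kernel's real numbers; only the per-level growth is the point.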

Fixes: 5d097056c9a0 ("kmemcg: account certain kmem allocations to memcg")
Cc: stable@vger.kernel.org
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Roman Gushchin <guro@fb.com>
---
 kernel/pid_namespace.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c
index 6cd6715..a46a372 100644
--- a/kernel/pid_namespace.c
+++ b/kernel/pid_namespace.c
@@ -51,7 +51,8 @@ static struct kmem_cache *create_pid_cachep(unsigned int level)
 	mutex_lock(&pid_caches_mutex);
 	/* Name collision forces to do allocation under mutex. */
 	if (!*pkc)
-		*pkc = kmem_cache_create(name, len, 0, SLAB_HWCACHE_ALIGN, 0);
+		*pkc = kmem_cache_create(name, len, 0,
+					 SLAB_HWCACHE_ALIGN | SLAB_ACCOUNT, 0);
 	mutex_unlock(&pid_caches_mutex);
 	/* current can fail, but someone else can succeed. */
 	return READ_ONCE(*pkc);
-- 
1.8.3.1
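
The effect of the one-flag change can be illustrated with a small userspace
model. Everything here is an assumption made for illustration:
MODEL_HWCACHE_ALIGN, MODEL_ACCOUNT, struct memcg_model, struct cache_model
and cache_model_alloc() are hypothetical, the flag values are not the
kernel's, and treating a limit hit as a plain allocation failure is a
deliberate simplification of what memcg really does. The sketch only shows
why allocations from a cache without the accounting flag never count
against the limit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values for this model only; the kernel's differ. */
#define MODEL_HWCACHE_ALIGN 0x1u
#define MODEL_ACCOUNT       0x2u

/* Minimal stand-in for a memory cgroup: a counter and a limit. */
struct memcg_model {
	size_t charged;
	size_t limit;
};

/* Minimal stand-in for a slab cache: object size plus creation flags. */
struct cache_model {
	size_t object_size;
	unsigned int flags;
};

/*
 * Model one allocation from cache c on behalf of cgroup cg.
 * Returns 0 on success, -1 if charging would exceed the limit.
 */
static int cache_model_alloc(const struct cache_model *c,
			     struct memcg_model *cg)
{
	assert(c && cg);

	/* Without the accounting flag the cgroup never sees the memory. */
	if (!(c->flags & MODEL_ACCOUNT))
		return 0;

	if (cg->charged + c->object_size > cg->limit)
		return -1;

	cg->charged += c->object_size;
	return 0;
}
```

In this model a cache created with only MODEL_HWCACHE_ALIGN, as before the
patch, can be allocated from indefinitely while cg->charged stays at zero.
A cache created with MODEL_HWCACHE_ALIGN | MODEL_ACCOUNT, as after the
patch, charges every object and starts failing once the limit is reached.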

