* [PATCH -mm] slab: use cgroup ino for naming per memcg caches
@ 2015-04-07 13:53 ` Vladimir Davydov
  0 siblings, 0 replies; 13+ messages in thread
From: Vladimir Davydov @ 2015-04-07 13:53 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Michal Hocko, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, linux-mm, cgroups, linux-kernel

The name of a per memcg kmem cache consists of three parts: the global
kmem cache name, the cgroup name, and the css id. The last of these is
used to guarantee cache name uniqueness.

Since css ids are opaque to userspace, it is in general impossible to
find a cache's owner cgroup given its name: there might be several
same-named cgroups with different parents, so that their caches' names
differ only by css id. Looking up the owner cgroup by a cache name,
however, could be useful for debugging. For instance, the cache name is
dumped to dmesg on a slab allocation failure. Another example is
/sys/kernel/slab, which exports some extra info/tunables for SLUB
caches, referring to them by name.

This patch substitutes the css id with the cgroup inode number, which,
just like the css id, is reserved until css free, so that the cache
names are still guaranteed to be unique but, in contrast to the css id,
can be easily obtained from userspace.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
---
 mm/slab_common.c |    9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index 999bb3424d44..e97bf3e04ed7 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -478,7 +478,7 @@ void memcg_create_kmem_cache(struct mem_cgroup *memcg,
 			     struct kmem_cache *root_cache)
 {
 	static char memcg_name_buf[NAME_MAX + 1]; /* protected by slab_mutex */
-	struct cgroup_subsys_state *css = mem_cgroup_css(memcg);
+	struct cgroup *cgroup;
 	struct memcg_cache_array *arr;
 	struct kmem_cache *s = NULL;
 	char *cache_name;
@@ -508,9 +508,10 @@ void memcg_create_kmem_cache(struct mem_cgroup *memcg,
 	if (arr->entries[idx])
 		goto out_unlock;
 
-	cgroup_name(css->cgroup, memcg_name_buf, sizeof(memcg_name_buf));
-	cache_name = kasprintf(GFP_KERNEL, "%s(%d:%s)", root_cache->name,
-			       css->id, memcg_name_buf);
+	cgroup = mem_cgroup_css(memcg)->cgroup;
+	cgroup_name(cgroup, memcg_name_buf, sizeof(memcg_name_buf));
+	cache_name = kasprintf(GFP_KERNEL, "%s(%lu:%s)", root_cache->name,
+			(unsigned long)cgroup_ino(cgroup), memcg_name_buf);
 	if (!cache_name)
 		goto out_unlock;
 
-- 
1.7.10.4
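
To illustrate what the new naming allows, here is a minimal userspace
sketch that splits a cache name of the form produced by the
"%s(%lu:%s)" format above into its parts, and checks whether a given
directory has the inode number found in the name. The helper names
parse_memcg_cache_name() and dir_has_ino() are hypothetical and not
part of the patch. The sketch assumes that the root cache name contains
no '(' and that the inode number reported by stat(2) on a cgroup
directory is the value the kernel prints.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Split "<root>(<ino>:<cgroup>)" into its three parts.
 * Returns 0 on success, -1 if the name does not match the format
 * (e.g. a global cache name) or a buffer is too small.
 * Assumption: the root cache name itself contains no '('.
 */
static int parse_memcg_cache_name(const char *name,
				  char *root, size_t root_len,
				  unsigned long *ino,
				  char *cgrp, size_t cgrp_len)
{
	size_t len = strlen(name);
	const char *open = strchr(name, '(');
	char *colon;
	size_t n;

	/* Need an opening '(' and a closing ')' as the last character. */
	if (!open || name[len - 1] != ')')
		return -1;

	/* The inode number must start right after '(' and end at ':'. */
	if (!isdigit((unsigned char)open[1]))
		return -1;
	errno = 0;
	*ino = strtoul(open + 1, &colon, 10);
	if (errno || *colon != ':')
		return -1;

	/* Everything before '(' is the root (global) cache name. */
	n = (size_t)(open - name);
	if (n == 0 || n >= root_len)
		return -1;
	memcpy(root, name, n);
	root[n] = '\0';

	/* The cgroup name runs from after ':' up to the final ')'. */
	n = (size_t)((name + len - 1) - (colon + 1));
	if (n >= cgrp_len)
		return -1;
	memcpy(cgrp, colon + 1, n);
	cgrp[n] = '\0';
	return 0;
}

/*
 * Returns 1 if path is a directory whose inode number equals ino,
 * 0 if it is not, and -1 if stat() fails.
 */
static int dir_has_ino(const char *path, unsigned long ino)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;
	return S_ISDIR(st.st_mode) && (unsigned long)st.st_ino == ino;
}
```

A debugging tool could parse a name taken from dmesg, then walk the
memory cgroup mount and call dir_has_ino() on each directory until one
matches. The cgroup name part narrows the search but, as noted above,
is not unique on its own.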


* Re: [PATCH -mm] slab: use cgroup ino for naming per memcg caches
  2015-04-07 13:53 ` Vladimir Davydov
@ 2015-04-07 20:38   ` Andrew Morton
  -1 siblings, 0 replies; 13+ messages in thread
From: Andrew Morton @ 2015-04-07 20:38 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Johannes Weiner, Michal Hocko, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, linux-mm, cgroups, linux-kernel

On Tue, 7 Apr 2015 16:53:18 +0300 Vladimir Davydov <vdavydov@parallels.com> wrote:

> The name of a per memcg kmem cache consists of three parts: the global
> kmem cache name, the cgroup name, and the css id. The latter is used to
> guarantee cache name uniqueness.
> 
> Since css ids are opaque to the userspace, in general it is impossible
> to find a cache's owner cgroup given its name: there might be several
> same-named cgroups with different parents so that their caches' names
> will only differ by css id. Looking up the owner cgroup by a cache name,
> however, could be useful for debugging. For instance, the cache name is
> dumped to dmesg on a slab allocation failure. Another example is
> /sys/kernel/slab, which exports some extra info/tunables for SLUB caches

/proc/sys/kernel/slab?

> referring to them by name.
> 
> This patch substitutes the css id with cgroup inode number, which, just
> like css id, is reserved until css free, so that the cache names are
> still guaranteed to be unique, but, in contrast to css id, it can be
> easily obtained from userspace.
> 
> ...
>
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -478,7 +478,7 @@ void memcg_create_kmem_cache(struct mem_cgroup *memcg,
>  			     struct kmem_cache *root_cache)
>  {
>  	static char memcg_name_buf[NAME_MAX + 1]; /* protected by slab_mutex */
> -	struct cgroup_subsys_state *css = mem_cgroup_css(memcg);
> +	struct cgroup *cgroup;
>  	struct memcg_cache_array *arr;
>  	struct kmem_cache *s = NULL;
>  	char *cache_name;
> @@ -508,9 +508,10 @@ void memcg_create_kmem_cache(struct mem_cgroup *memcg,
>  	if (arr->entries[idx])
>  		goto out_unlock;
>  
> -	cgroup_name(css->cgroup, memcg_name_buf, sizeof(memcg_name_buf));
> -	cache_name = kasprintf(GFP_KERNEL, "%s(%d:%s)", root_cache->name,
> -			       css->id, memcg_name_buf);
> +	cgroup = mem_cgroup_css(memcg)->cgroup;
> +	cgroup_name(cgroup, memcg_name_buf, sizeof(memcg_name_buf));
> +	cache_name = kasprintf(GFP_KERNEL, "%s(%lu:%s)", root_cache->name,
> +			(unsigned long)cgroup_ino(cgroup), memcg_name_buf);
>  	if (!cache_name)
>  		goto out_unlock;

Is this interface documented anywhere?

* Re: [PATCH -mm] slab: use cgroup ino for naming per memcg caches
  2015-04-07 20:38   ` Andrew Morton
  (?)
@ 2015-04-08  9:54     ` Vladimir Davydov
  -1 siblings, 0 replies; 13+ messages in thread
From: Vladimir Davydov @ 2015-04-08  9:54 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Michal Hocko, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, linux-mm, cgroups, linux-kernel

On Tue, Apr 07, 2015 at 01:38:19PM -0700, Andrew Morton wrote:
> On Tue, 7 Apr 2015 16:53:18 +0300 Vladimir Davydov <vdavydov@parallels.com> wrote:
> 
> > The name of a per memcg kmem cache consists of three parts: the global
> > kmem cache name, the cgroup name, and the css id. The latter is used to
> > guarantee cache name uniqueness.
> > 
> > Since css ids are opaque to the userspace, in general it is impossible
> > to find a cache's owner cgroup given its name: there might be several
> > same-named cgroups with different parents so that their caches' names
> > will only differ by css id. Looking up the owner cgroup by a cache name,
> > however, could be useful for debugging. For instance, the cache name is
> > dumped to dmesg on a slab allocation failure. Another example is
> > /sys/kernel/slab, which exports some extra info/tunables for SLUB caches
> 
> /proc/sys/kernel/slab?

No, /sys/kernel/slab/. There is a directory with tunables for each
global cache there (only for SLUB). If CONFIG_MEMCG_KMEM is on, there is
also /sys/kernel/slab/<slab-name>/cgroup/, which contains directories
with tunables for each per memcg cache.

> 
> > referring to them by name.
> > 
> > This patch substitutes the css id with cgroup inode number, which, just
> > like css id, is reserved until css free, so that the cache names are
> > still guaranteed to be unique, but, in contrast to css id, it can be
> > easily obtained from userspace.
> > 
> > ...
> >
> > --- a/mm/slab_common.c
> > +++ b/mm/slab_common.c
> > @@ -478,7 +478,7 @@ void memcg_create_kmem_cache(struct mem_cgroup *memcg,
> >  			     struct kmem_cache *root_cache)
> >  {
> >  	static char memcg_name_buf[NAME_MAX + 1]; /* protected by slab_mutex */
> > -	struct cgroup_subsys_state *css = mem_cgroup_css(memcg);
> > +	struct cgroup *cgroup;
> >  	struct memcg_cache_array *arr;
> >  	struct kmem_cache *s = NULL;
> >  	char *cache_name;
> > @@ -508,9 +508,10 @@ void memcg_create_kmem_cache(struct mem_cgroup *memcg,
> >  	if (arr->entries[idx])
> >  		goto out_unlock;
> >  
> > -	cgroup_name(css->cgroup, memcg_name_buf, sizeof(memcg_name_buf));
> > -	cache_name = kasprintf(GFP_KERNEL, "%s(%d:%s)", root_cache->name,
> > -			       css->id, memcg_name_buf);
> > +	cgroup = mem_cgroup_css(memcg)->cgroup;
> > +	cgroup_name(cgroup, memcg_name_buf, sizeof(memcg_name_buf));
> > +	cache_name = kasprintf(GFP_KERNEL, "%s(%lu:%s)", root_cache->name,
> > +			(unsigned long)cgroup_ino(cgroup), memcg_name_buf);
> >  	if (!cache_name)
> >  		goto out_unlock;
> 
> Is this interface documented anywhere?
> 

No. Although the /sys/kernel/slab/ tunables are documented in
Documentation/ABI/testing/sysfs-kernel-slab and the /sys/kernel/slab/
directory is mentioned in Documentation/vm/slub.txt, neither of these
files refers to the interface for per memcg caches. I can document it if
necessary.

Come to think of it, was it really a good idea to group per memcg caches
under /sys/kernel/slab/<slab-name>/cgroup/ instead of keeping them all
in /sys/kernel/slab/? I introduced this cgroup/ directory to clean up
/sys/kernel/slab/ (9a41707bd3a08), which looked too crowded when there
were a lot of active memory cgroups. Unfortunately, nobody commented on
that patch at the time. Frankly, today I am not so sure it was the
right thing to do :-(

E.g.

/sys/kernel/slab/<slab-name>/objects (counts allocated objects)

does NOT include

/sys/kernel/slab/<slab-name>/cgroup/*/objects

which looks dubious to me, because this cgroup/ dir implies a
hierarchical structure, while in fact it does not act like that.
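
As an illustration of the point above, a tool that wants the total for
a cache including its per memcg copies has to add the numbers up
itself. The following is a minimal sketch under stated assumptions: it
assumes each objects file starts with the total as a decimal number
(optionally followed by per-node counts such as "N0=512"), and the
helper names parse_objects_total(), add_matching() and
sum_cache_objects() are hypothetical.

```c
#include <errno.h>
#include <glob.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Parse the leading total from a line such as "1024 N0=512 N1=512". */
static int parse_objects_total(const char *line, unsigned long *total)
{
	char *end;

	if (*line < '0' || *line > '9')
		return -1;
	errno = 0;
	*total = strtoul(line, &end, 10);
	return errno ? -1 : 0;
}

/*
 * Add the total of every file matching pattern to *sum.
 * Returns the number of files successfully read.
 */
static int add_matching(const char *pattern, unsigned long *sum)
{
	glob_t g;
	char line[256];
	unsigned long total;
	int nread = 0;
	size_t i;

	memset(&g, 0, sizeof(g));
	if (glob(pattern, 0, NULL, &g) != 0) {
		/* No match or error: nothing to add. */
		globfree(&g);
		return 0;
	}

	for (i = 0; i < g.gl_pathc; i++) {
		FILE *f = fopen(g.gl_pathv[i], "r");

		if (!f)
			continue;
		if (fgets(line, sizeof(line), f) &&
		    parse_objects_total(line, &total) == 0) {
			*sum += total;
			nread++;
		}
		fclose(f);
	}
	globfree(&g);
	return nread;
}

/*
 * Sum <slab_dir>/objects and <slab_dir>/cgroup/<any>/objects, where
 * slab_dir is e.g. "/sys/kernel/slab/<slab-name>".
 * Returns the number of files read; *sum holds the combined total.
 */
static int sum_cache_objects(const char *slab_dir, unsigned long *sum)
{
	char pattern[4096];
	int nread = 0;

	*sum = 0;
	snprintf(pattern, sizeof(pattern), "%s/objects", slab_dir);
	nread += add_matching(pattern, sum);
	snprintf(pattern, sizeof(pattern), "%s/cgroup/*/objects", slab_dir);
	nread += add_matching(pattern, sum);
	return nread;
}
```

The slab_dir argument is used as part of a glob pattern, so the sketch
also assumes that the root cache name contains no glob metacharacters.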

Another unpleasant thing about this cgroup/ dir is that it reveals the
internal implementation of memcg/kmem: it shows that each memory cgroup
has its own copy of kmem cache. What if we decide to share the same kmem
cache among all memory cgroups one day? Of course, this will hardly ever
happen, but it is an alternative approach to implementing the same
feature, which makes this cgroup/ dir pointless. If we had all caches
under /sys/kernel/slab, it would not be a problem: the dirs
corresponding to per memcg caches would disappear then, but it would not
break userspace, which would have to treat per memcg caches just like
global ones - e.g. the slabinfo utility would just show fewer caches,
while if it supported the cgroup/ dir (which it currently does not), it
would require reworking.

Given that this cgroup/ dir has never been documented and is only added
if CONFIG_MEMCG_KMEM is on, which was marked as UNDER DEVELOPMENT until
recently, could we perhaps revert it?

Thanks,
Vladimir

* Re: [PATCH -mm] slab: use cgroup ino for naming per memcg caches
  2015-04-08  9:54     ` Vladimir Davydov
@ 2015-04-08 13:46       ` Christoph Lameter
  -1 siblings, 0 replies; 13+ messages in thread
From: Christoph Lameter @ 2015-04-08 13:46 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Pekka Enberg,
	David Rientjes, Joonsoo Kim, linux-mm, cgroups, linux-kernel

On Wed, 8 Apr 2015, Vladimir Davydov wrote:

> has its own copy of kmem cache. What if we decide to share the same kmem
> cache among all memory cgroups one day? Of course, this will hardly ever
> happen, but it is an alternative approach to implementing the same

/sys/kernel/slab already supports the use of symlinks. And both SLAB and
SLUB do slab merging which means effectively an aliasing of multiple slab
caches to the same name.


* Re: [PATCH -mm] slab: use cgroup ino for naming per memcg caches
  2015-04-08 13:46       ` Christoph Lameter
@ 2015-04-08 18:19         ` Vladimir Davydov
  -1 siblings, 0 replies; 13+ messages in thread
From: Vladimir Davydov @ 2015-04-08 18:19 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Pekka Enberg,
	David Rientjes, Joonsoo Kim, linux-mm, cgroups, linux-kernel

On Wed, Apr 08, 2015 at 08:46:22AM -0500, Christoph Lameter wrote:
> On Wed, 8 Apr 2015, Vladimir Davydov wrote:
> 
> > has its own copy of kmem cache. What if we decide to share the same kmem
> > cache among all memory cgroups one day? Of course, this will hardly ever
> > happen, but it is an alternative approach to implementing the same
> 
> /sys/kernel/slab already supports the use of symlinks. And both SLAB and
> SLUB do slab merging which means effectively an aliasing of multiple slab
> caches to the same name.

Yeah, I think cache merging is a good argument for grouping memcg caches
under /sys/kernel/slab/<slab-name>/cgroup/. We cannot maintain symlinks
for merged memcg caches, because when a memcg cache is created we do not
have the names of the caches the new cache is merged with. If memcg
caches were listed under /sys/kernel/slab/ along with global ones, the
absence of the symlinks would lead to confusion.

Thanks,
Vladimir

* Re: [PATCH -mm] slab: use cgroup ino for naming per memcg caches
  2015-04-08 18:19         ` Vladimir Davydov
@ 2015-04-08 18:24           ` Christoph Lameter
  -1 siblings, 0 replies; 13+ messages in thread
From: Christoph Lameter @ 2015-04-08 18:24 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Pekka Enberg,
	David Rientjes, Joonsoo Kim, linux-mm, cgroups, linux-kernel

On Wed, 8 Apr 2015, Vladimir Davydov wrote:

> Yeah, I think cache merging is a good argument for grouping memcg caches
> under /sys/kernel/slab/<slab-name>/cgroup/. We cannot maintain symlinks
> for merged memcg caches, because when a memcg cache is created we do not
> have names of caches the new cache is merged with. If memcg caches were
> listed under /sys/kernel/slab/ along with global ones, absence of the
> symlinks would lead to confusion.

The point of the unique name creation is to not have to use the name given
by the user for the slab. You can generate a unique identifier and use
that as a target for the symlink.


