linux-kernel.vger.kernel.org archive mirror
* [PATCH -mm 0/6] memcg: release memcg_cache_id on css offline
@ 2014-07-21 11:47 Vladimir Davydov
  2014-07-21 11:47 ` [PATCH -mm 1/6] slub: remove kmemcg id from create_unique_id Vladimir Davydov
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: Vladimir Davydov @ 2014-07-21 11:47 UTC (permalink / raw)
  To: akpm; +Cc: mhocko, hannes, cl, linux-mm, linux-kernel

Hi,

Currently memcg_cache_id (mem_cgroup->kmemcg_id), which is used for
indexing memcg_caches arrays, is released only on css free. As a result,
offline css's, whose number is limited only by the amount of free RAM,
occupy slots in these arrays, making them grow larger and larger even if
there are only a few kmem-active memory cgroups out there.

This patch set makes memcg release memcg_cache_id on css offline. This
way the memcg_caches arrays size will be limited by the number of alive
kmem-active memory cgroups, which is much better.

The work is actually done in patch 6, while patches 1-5 only prepare
the memcg and slab subsystems for this change.
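
To illustrate the growth problem, here is a minimal userspace sketch,
not kernel code. It assumes ids are handed out lowest-free-first and
that an id-indexed array needs (highest id in use + 1) slots. The names
id_pool, slots_needed and simulate are hypothetical and exist only for
this illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_IDS 64	/* hypothetical cap, only for this sketch */

/* Toy id allocator that always hands out the lowest free id. */
struct id_pool {
	bool used[MAX_IDS];
};

static int id_alloc(struct id_pool *p)
{
	for (int i = 0; i < MAX_IDS; i++) {
		if (!p->used[i]) {
			p->used[i] = true;
			return i;
		}
	}
	return -1;	/* pool exhausted */
}

static void id_free(struct id_pool *p, int id)
{
	assert(id >= 0 && id < MAX_IDS);
	p->used[id] = false;
}

/* An id-indexed array needs (highest id in use + 1) slots. */
static int slots_needed(const struct id_pool *p)
{
	for (int i = MAX_IDS - 1; i >= 0; i--)
		if (p->used[i])
			return i + 1;
	return 0;
}

/*
 * Create and offline `churn` cgroups one after another, then create a
 * single live cgroup. Returns the number of array slots needed.
 */
static int simulate(int churn, bool release_on_offline)
{
	struct id_pool pool = { 0 };

	for (int i = 0; i < churn; i++) {
		int id = id_alloc(&pool);

		if (id >= 0 && release_on_offline)
			id_free(&pool, id);	/* id returned at offline */
		/* otherwise the zombie keeps its id until css free */
	}
	id_alloc(&pool);	/* the one cgroup that is actually alive */
	return slots_needed(&pool);
}
```

With ten cgroups created and taken offline, simulate(10, false) needs
11 slots for one live cgroup, while simulate(10, true) needs 1.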

Thanks,

Vladimir Davydov (6):
  slub: remove kmemcg id from create_unique_id
  slab: use mem_cgroup_id for per memcg cache naming
  memcg: make memcg_cache_id static
  memcg: add pointer to owner cache to memcg_cache_params
  memcg: keep all children of each root cache on a list
  memcg: release memcg_cache_id on css offline

 include/linux/memcontrol.h |    9 +---
 include/linux/slab.h       |    7 ++-
 mm/memcontrol.c            |  112 +++++++++++++++++++++++++++-----------------
 mm/slab.c                  |   40 +++++++++-------
 mm/slab_common.c           |   44 ++++++++---------
 mm/slub.c                  |   45 +++++++++---------
 6 files changed, 140 insertions(+), 117 deletions(-)

-- 
1.7.10.4



* [PATCH -mm 1/6] slub: remove kmemcg id from create_unique_id
  2014-07-21 11:47 [PATCH -mm 0/6] memcg: release memcg_cache_id on css offline Vladimir Davydov
@ 2014-07-21 11:47 ` Vladimir Davydov
  2014-07-21 11:47 ` [PATCH -mm 2/6] slab: use mem_cgroup_id for per memcg cache naming Vladimir Davydov
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Vladimir Davydov @ 2014-07-21 11:47 UTC (permalink / raw)
  To: akpm; +Cc: mhocko, hannes, cl, linux-mm, linux-kernel

This function is never called for memcg caches, because they are
unmergeable, so remove the dead code.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
---
 mm/slub.c |    6 ------
 1 file changed, 6 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 2b068c3638aa..a1cdbad02f0c 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -5130,12 +5130,6 @@ static char *create_unique_id(struct kmem_cache *s)
 		*p++ = '-';
 	p += sprintf(p, "%07d", s->size);
 
-#ifdef CONFIG_MEMCG_KMEM
-	if (!is_root_cache(s))
-		p += sprintf(p, "-%08d",
-				memcg_cache_id(s->memcg_params->memcg));
-#endif
-
 	BUG_ON(p > name + ID_STR_LENGTH - 1);
 	return name;
 }
-- 
1.7.10.4



* [PATCH -mm 2/6] slab: use mem_cgroup_id for per memcg cache naming
  2014-07-21 11:47 [PATCH -mm 0/6] memcg: release memcg_cache_id on css offline Vladimir Davydov
  2014-07-21 11:47 ` [PATCH -mm 1/6] slub: remove kmemcg id from create_unique_id Vladimir Davydov
@ 2014-07-21 11:47 ` Vladimir Davydov
  2014-07-21 11:47 ` [PATCH -mm 3/6] memcg: make memcg_cache_id static Vladimir Davydov
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Vladimir Davydov @ 2014-07-21 11:47 UTC (permalink / raw)
  To: akpm; +Cc: mhocko, hannes, cl, linux-mm, linux-kernel

Currently, we use memcg_cache_id as part of a per memcg cache name.
Since memcg_cache_id is released only on css free, this guarantees cache
name uniqueness.

However, it's bad practice to keep memcg_cache_id till css free, because
it occupies a slot in the kmem_cache->memcg_params->memcg_caches arrays.
So I'm going to make memcg release memcg_cache_id on css offline. As a
result, memcg_cache_id won't guarantee cache name uniqueness any more.

So use mem_cgroup_id for naming per memcg caches instead. The cache name
is now built in memcg_register_cache and passed as a ready string to
memcg_create_kmem_cache; the caller frees it if cache creation fails.
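
The naming scheme can be sketched in userspace as follows. This is only
an illustration: make_memcg_cache_name and names_collide are
hypothetical helpers, and snprintf/malloc stand in for the kernel's
kasprintf. The "%s(%d:%s)" format is the one used in the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Build "<root cache name>(<id>:<cgroup name>)". Returns a malloc'ed
 * string that the caller must free(), or NULL on failure.
 */
static char *make_memcg_cache_name(const char *root_name, int id,
				   const char *cgroup_name)
{
	char *buf;
	int len;

	assert(root_name && cgroup_name);

	/* First pass only measures the formatted length. */
	len = snprintf(NULL, 0, "%s(%d:%s)", root_name, id, cgroup_name);
	if (len < 0)
		return NULL;

	buf = malloc((size_t)len + 1);
	if (!buf)
		return NULL;

	snprintf(buf, (size_t)len + 1, "%s(%d:%s)", root_name, id,
		 cgroup_name);
	return buf;
}

/* Two caches clash if their generated names are identical. */
static bool names_collide(const char *a, const char *b)
{
	return strcmp(a, b) == 0;
}
```

If an id can be reused while an old cache with the same cgroup name
still exists, the two names are identical; an id that is not reused
before the old cache goes away keeps them distinct.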

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
---
 include/linux/slab.h |    2 +-
 mm/memcontrol.c      |   13 +++++++++++--
 mm/slab_common.c     |   15 +++------------
 3 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 1d9abb7d22a0..14888328e96b 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -118,7 +118,7 @@ struct kmem_cache *kmem_cache_create(const char *, size_t, size_t,
 #ifdef CONFIG_MEMCG_KMEM
 struct kmem_cache *memcg_create_kmem_cache(struct mem_cgroup *,
 					   struct kmem_cache *,
-					   const char *);
+					   char *);
 #endif
 void kmem_cache_destroy(struct kmem_cache *);
 int kmem_cache_shrink(struct kmem_cache *);
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 85dd94f2ecce..7d5c4a5e4c74 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3003,6 +3003,7 @@ static void memcg_register_cache(struct mem_cgroup *memcg,
 	static char memcg_name_buf[NAME_MAX + 1]; /* protected by
 						     memcg_slab_mutex */
 	struct kmem_cache *cachep;
+	char *cache_name;
 	int id;
 
 	lockdep_assert_held(&memcg_slab_mutex);
@@ -3018,14 +3019,22 @@ static void memcg_register_cache(struct mem_cgroup *memcg,
 		return;
 
 	cgroup_name(memcg->css.cgroup, memcg_name_buf, NAME_MAX + 1);
-	cachep = memcg_create_kmem_cache(memcg, root_cache, memcg_name_buf);
+
+	cache_name = kasprintf(GFP_KERNEL, "%s(%d:%s)", root_cache->name,
+			       mem_cgroup_id(memcg), memcg_name_buf);
+	if (!cache_name)
+		return;
+
+	cachep = memcg_create_kmem_cache(memcg, root_cache, cache_name);
 	/*
 	 * If we could not create a memcg cache, do not complain, because
 	 * that's not critical at all as we can always proceed with the root
 	 * cache.
 	 */
-	if (!cachep)
+	if (!cachep) {
+		kfree(cache_name);
 		return;
+	}
 
 	list_add(&cachep->memcg_params->list, &memcg->memcg_slab_caches);
 
diff --git a/mm/slab_common.c b/mm/slab_common.c
index d319502b2403..a847fc86ac32 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -266,7 +266,7 @@ EXPORT_SYMBOL(kmem_cache_create);
  * memcg_create_kmem_cache - Create a cache for a memory cgroup.
  * @memcg: The memory cgroup the new cache is for.
  * @root_cache: The parent of the new cache.
- * @memcg_name: The name of the memory cgroup (used for naming the new cache).
+ * @cache_name: The string to be used as the new cache name.
  *
  * This function attempts to create a kmem cache that will serve allocation
  * requests going from @memcg to @root_cache. The new cache inherits properties
@@ -274,31 +274,22 @@ EXPORT_SYMBOL(kmem_cache_create);
  */
 struct kmem_cache *memcg_create_kmem_cache(struct mem_cgroup *memcg,
 					   struct kmem_cache *root_cache,
-					   const char *memcg_name)
+					   char *cache_name)
 {
 	struct kmem_cache *s = NULL;
-	char *cache_name;
 
 	get_online_cpus();
 	get_online_mems();
 
 	mutex_lock(&slab_mutex);
 
-	cache_name = kasprintf(GFP_KERNEL, "%s(%d:%s)", root_cache->name,
-			       memcg_cache_id(memcg), memcg_name);
-	if (!cache_name)
-		goto out_unlock;
-
 	s = do_kmem_cache_create(cache_name, root_cache->object_size,
 				 root_cache->size, root_cache->align,
 				 root_cache->flags, root_cache->ctor,
 				 memcg, root_cache);
-	if (IS_ERR(s)) {
-		kfree(cache_name);
+	if (IS_ERR(s))
 		s = NULL;
-	}
 
-out_unlock:
 	mutex_unlock(&slab_mutex);
 
 	put_online_mems();
-- 
1.7.10.4



* [PATCH -mm 3/6] memcg: make memcg_cache_id static
  2014-07-21 11:47 [PATCH -mm 0/6] memcg: release memcg_cache_id on css offline Vladimir Davydov
  2014-07-21 11:47 ` [PATCH -mm 1/6] slub: remove kmemcg id from create_unique_id Vladimir Davydov
  2014-07-21 11:47 ` [PATCH -mm 2/6] slab: use mem_cgroup_id for per memcg cache naming Vladimir Davydov
@ 2014-07-21 11:47 ` Vladimir Davydov
  2014-07-21 11:47 ` [PATCH -mm 4/6] memcg: add pointer to owner cache to memcg_cache_params Vladimir Davydov
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Vladimir Davydov @ 2014-07-21 11:47 UTC (permalink / raw)
  To: akpm; +Cc: mhocko, hannes, cl, linux-mm, linux-kernel

It's not used anywhere outside mm/memcontrol.c.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
---
 include/linux/memcontrol.h |    7 -------
 mm/memcontrol.c            |   20 ++++++++++----------
 2 files changed, 10 insertions(+), 17 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index e0752d204d9e..4b4a26725cbb 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -438,8 +438,6 @@ void __memcg_kmem_commit_charge(struct page *page,
 				       struct mem_cgroup *memcg, int order);
 void __memcg_kmem_uncharge_pages(struct page *page, int order);
 
-int memcg_cache_id(struct mem_cgroup *memcg);
-
 int memcg_alloc_cache_params(struct mem_cgroup *memcg, struct kmem_cache *s,
 			     struct kmem_cache *root_cache);
 void memcg_free_cache_params(struct kmem_cache *s);
@@ -569,11 +567,6 @@ memcg_kmem_commit_charge(struct page *page, struct mem_cgroup *memcg, int order)
 {
 }
 
-static inline int memcg_cache_id(struct mem_cgroup *memcg)
-{
-	return -1;
-}
-
 static inline int memcg_alloc_cache_params(struct mem_cgroup *memcg,
 		struct kmem_cache *s, struct kmem_cache *root_cache)
 {
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 7d5c4a5e4c74..cc1064a504cc 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2781,6 +2781,16 @@ static inline bool memcg_can_account_kmem(struct mem_cgroup *memcg)
 }
 
 /*
+ * helper for accessing a memcg's index. It will be used as an index in the
+ * child cache array in kmem_cache, and also to derive its name. This function
+ * will return -1 when this is not a kmem-limited memcg.
+ */
+static inline int memcg_cache_id(struct mem_cgroup *memcg)
+{
+	return memcg ? memcg->kmemcg_id : -1;
+}
+
+/*
  * This is a bit cumbersome, but it is rarely used and avoids a backpointer
  * in the memcg_cache_params struct.
  */
@@ -2872,16 +2882,6 @@ static void memcg_uncharge_kmem(struct mem_cgroup *memcg, u64 size)
 		css_put(&memcg->css);
 }
 
-/*
- * helper for acessing a memcg's index. It will be used as an index in the
- * child cache array in kmem_cache, and also to derive its name. This function
- * will return -1 when this is not a kmem-limited memcg.
- */
-int memcg_cache_id(struct mem_cgroup *memcg)
-{
-	return memcg ? memcg->kmemcg_id : -1;
-}
-
 static size_t memcg_caches_array_size(int num_groups)
 {
 	ssize_t size;
-- 
1.7.10.4



* [PATCH -mm 4/6] memcg: add pointer to owner cache to memcg_cache_params
  2014-07-21 11:47 [PATCH -mm 0/6] memcg: release memcg_cache_id on css offline Vladimir Davydov
                   ` (2 preceding siblings ...)
  2014-07-21 11:47 ` [PATCH -mm 3/6] memcg: make memcg_cache_id static Vladimir Davydov
@ 2014-07-21 11:47 ` Vladimir Davydov
  2014-07-21 11:47 ` [PATCH -mm 5/6] memcg: keep all children of each root cache on a list Vladimir Davydov
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Vladimir Davydov @ 2014-07-21 11:47 UTC (permalink / raw)
  To: akpm; +Cc: mhocko, hannes, cl, linux-mm, linux-kernel

We don't keep a pointer to the owner kmem cache in the
memcg_cache_params struct, because we can always get the cache by
reading the slot corresponding to the owner memcg in the root cache's
memcg_caches array (see memcg_params_to_cache).

However, this means that offline css's, which can linger as zombies for
quite a long time, will occupy slots in memcg_caches arrays, making
them grow larger and larger, which doesn't sound good. Therefore I'm
going to make memcg release the slots on offline, which will render
memcg_params_to_cache invalid. So I'm removing it and adding a back
pointer to memcg_cache_params instead.
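
A simplified userspace sketch of why the slot lookup stops working once
slots are released, while a back pointer keeps working. The struct
layout here is an assumption made for illustration: the real id lives
in struct mem_cgroup, not in the params, and all names below are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SLOTS 4	/* hypothetical array size */

struct cache;

struct cache_params {
	struct cache *cachep;		/* back pointer to the owner */
	struct cache *root_cache;
	int memcg_id;			/* -1 once the id is released */
};

struct cache {
	struct cache_params params;
	struct cache *memcg_caches[NR_SLOTS];	/* root caches only */
};

/* Old scheme: find the owner through the root cache's array slot. */
static struct cache *params_to_cache_via_slot(const struct cache_params *p)
{
	if (p->memcg_id < 0 || p->memcg_id >= NR_SLOTS)
		return NULL;
	return p->root_cache->memcg_caches[p->memcg_id];
}

/* New scheme: just follow the back pointer. */
static struct cache *params_to_cache(const struct cache_params *p)
{
	return p->cachep;
}

static void link_child(struct cache *root, struct cache *child, int id)
{
	assert(id >= 0 && id < NR_SLOTS);
	child->params.cachep = child;
	child->params.root_cache = root;
	child->params.memcg_id = id;
	root->memcg_caches[id] = child;
}

/* What happens on offline once slots are released early. */
static void release_slot(struct cache *root, struct cache *child)
{
	assert(child->params.memcg_id >= 0);
	root->memcg_caches[child->params.memcg_id] = NULL;
	child->params.memcg_id = -1;
}
```

After release_slot, params_to_cache_via_slot returns NULL for the
child's params, but params_to_cache still returns the child.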

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
---
 include/linux/slab.h |    2 ++
 mm/memcontrol.c      |   20 ++++----------------
 2 files changed, 6 insertions(+), 16 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 14888328e96b..e6e6ddb769c7 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -523,6 +523,7 @@ static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
  *
  * Child caches will hold extra metadata needed for its operation. Fields are:
  *
+ * @cachep: cache which this struct is for
  * @memcg: pointer to the memcg this cache belongs to
  * @list: list_head for the list of all caches in this memcg
  * @root_cache: pointer to the global, root cache, this cache was derived from
@@ -536,6 +537,7 @@ struct memcg_cache_params {
 			struct kmem_cache *memcg_caches[0];
 		};
 		struct {
+			struct kmem_cache *cachep;
 			struct mem_cgroup *memcg;
 			struct list_head list;
 			struct kmem_cache *root_cache;
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index cc1064a504cc..aa3111ac3b7e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2790,19 +2790,6 @@ static inline int memcg_cache_id(struct mem_cgroup *memcg)
 	return memcg ? memcg->kmemcg_id : -1;
 }
 
-/*
- * This is a bit cumbersome, but it is rarely used and avoids a backpointer
- * in the memcg_cache_params struct.
- */
-static struct kmem_cache *memcg_params_to_cache(struct memcg_cache_params *p)
-{
-	struct kmem_cache *cachep;
-
-	VM_BUG_ON(p->is_root_cache);
-	cachep = p->root_cache;
-	return cache_from_memcg_idx(cachep, memcg_cache_id(p->memcg));
-}
-
 #ifdef CONFIG_SLABINFO
 static int mem_cgroup_slabinfo_read(struct seq_file *m, void *v)
 {
@@ -2816,7 +2803,7 @@ static int mem_cgroup_slabinfo_read(struct seq_file *m, void *v)
 
 	mutex_lock(&memcg_slab_mutex);
 	list_for_each_entry(params, &memcg->memcg_slab_caches, list)
-		cache_show(memcg_params_to_cache(params), m);
+		cache_show(params->cachep, m);
 	mutex_unlock(&memcg_slab_mutex);
 
 	return 0;
@@ -2979,6 +2966,7 @@ int memcg_alloc_cache_params(struct mem_cgroup *memcg, struct kmem_cache *s,
 		return -ENOMEM;
 
 	if (memcg) {
+		s->memcg_params->cachep = s;
 		s->memcg_params->memcg = memcg;
 		s->memcg_params->root_cache = root_cache;
 		css_get(&memcg->css);
@@ -3124,7 +3112,6 @@ int __memcg_cleanup_cache_params(struct kmem_cache *s)
 
 static void memcg_unregister_all_caches(struct mem_cgroup *memcg)
 {
-	struct kmem_cache *cachep;
 	struct memcg_cache_params *params, *tmp;
 
 	if (!memcg_kmem_is_active(memcg))
@@ -3132,7 +3119,8 @@ static void memcg_unregister_all_caches(struct mem_cgroup *memcg)
 
 	mutex_lock(&memcg_slab_mutex);
 	list_for_each_entry_safe(params, tmp, &memcg->memcg_slab_caches, list) {
-		cachep = memcg_params_to_cache(params);
+		struct kmem_cache *cachep = params->cachep;
+
 		kmem_cache_shrink(cachep);
 		if (atomic_read(&cachep->memcg_params->nr_pages) == 0)
 			memcg_unregister_cache(cachep);
-- 
1.7.10.4



* [PATCH -mm 5/6] memcg: keep all children of each root cache on a list
  2014-07-21 11:47 [PATCH -mm 0/6] memcg: release memcg_cache_id on css offline Vladimir Davydov
                   ` (3 preceding siblings ...)
  2014-07-21 11:47 ` [PATCH -mm 4/6] memcg: add pointer to owner cache to memcg_cache_params Vladimir Davydov
@ 2014-07-21 11:47 ` Vladimir Davydov
  2014-07-21 11:47 ` [PATCH -mm 6/6] memcg: release memcg_cache_id on css offline Vladimir Davydov
  2014-07-23 10:53 ` [PATCH -mm 0/6] " Vladimir Davydov
  6 siblings, 0 replies; 8+ messages in thread
From: Vladimir Davydov @ 2014-07-21 11:47 UTC (permalink / raw)
  To: akpm; +Cc: mhocko, hannes, cl, linux-mm, linux-kernel

Sometimes we need to iterate over all child caches of a particular root
cache, e.g. when we are destroying it. Currently each root cache keeps
pointers to its children in its memcg_cache_params->memcg_caches array,
so we can enumerate all active kmemcg ids and dereference the
corresponding array slots to get the child caches.

However, I'm going to make memcg clear the slots on offline to avoid
uncontrolled growth of the memcg_caches arrays. Hence, to iterate over
all memcg caches of a particular root cache, we have to link all memcg
caches to per root cache lists.
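
The children/siblings linkage can be sketched in userspace like this.
It is a toy model under stated assumptions: list_node is a hand-rolled
stand-in for the kernel's list_head, and the cache struct keeps both
list fields side by side instead of in a union. All names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on list_head. */
struct list_node {
	struct list_node *prev, *next;
};

static void list_init(struct list_node *head)
{
	head->prev = head->next = head;
}

static void list_add_head(struct list_node *node, struct list_node *head)
{
	node->next = head->next;
	node->prev = head;
	head->next->prev = node;
	head->next = node;
}

static void list_unlink(struct list_node *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	node->prev = node->next = node;
}

struct cache {
	int nr_objs;
	struct list_node children;	/* root cache: head of child list */
	struct list_node siblings;	/* child cache: link in that list */
};

/* Recover the cache that embeds a given siblings node. */
#define node_to_cache(ptr) \
	((struct cache *)((char *)(ptr) - offsetof(struct cache, siblings)))

static void root_init(struct cache *root)
{
	root->nr_objs = 0;
	list_init(&root->children);
	list_init(&root->siblings);
}

static void child_attach(struct cache *root, struct cache *child,
			 int nr_objs)
{
	assert(nr_objs >= 0);
	child->nr_objs = nr_objs;
	list_init(&child->children);
	list_add_head(&child->siblings, &root->children);
}

/* Walk every child without touching any id-indexed array. */
static int total_child_objs(struct cache *root)
{
	struct list_node *n;
	int sum = 0;

	for (n = root->children.next; n != &root->children; n = n->next)
		sum += node_to_cache(n)->nr_objs;
	return sum;
}
```

Because the walk follows list links only, it keeps finding a child
cache after that child's array slot has been cleared; calling
list_unlink on the child's siblings node removes it from the walk.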

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
---
 include/linux/memcontrol.h |    2 +-
 include/linux/slab.h       |    3 +++
 mm/memcontrol.c            |   27 ++++++++++++---------------
 mm/slab.c                  |   40 +++++++++++++++++++++++-----------------
 mm/slab_common.c           |   31 +++++++++++++++++--------------
 mm/slub.c                  |   39 +++++++++++++++++++++++----------------
 6 files changed, 79 insertions(+), 63 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 4b4a26725cbb..c15cb0c9f413 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -451,7 +451,7 @@ __memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp);
 int __memcg_charge_slab(struct kmem_cache *cachep, gfp_t gfp, int order);
 void __memcg_uncharge_slab(struct kmem_cache *cachep, int order);
 
-int __memcg_cleanup_cache_params(struct kmem_cache *s);
+void __memcg_cleanup_cache_params(struct kmem_cache *s);
 
 /**
  * memcg_kmem_newpage_charge: verify if a new kmem allocation is allowed.
diff --git a/include/linux/slab.h b/include/linux/slab.h
index e6e6ddb769c7..bf94461ca82e 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -527,6 +527,7 @@ static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
  * @memcg: pointer to the memcg this cache belongs to
  * @list: list_head for the list of all caches in this memcg
  * @root_cache: pointer to the global, root cache, this cache was derived from
+ * @siblings: list_head for the list of all child caches of the root_cache
  * @nr_pages: number of pages that belongs to this cache.
  */
 struct memcg_cache_params {
@@ -534,6 +535,7 @@ struct memcg_cache_params {
 	union {
 		struct {
 			struct rcu_head rcu_head;
+			struct list_head children;
 			struct kmem_cache *memcg_caches[0];
 		};
 		struct {
@@ -541,6 +543,7 @@ struct memcg_cache_params {
 			struct mem_cgroup *memcg;
 			struct list_head list;
 			struct kmem_cache *root_cache;
+			struct list_head siblings;
 			atomic_t nr_pages;
 		};
 	};
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index aa3111ac3b7e..3ee37189e57e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2914,6 +2914,10 @@ int memcg_update_cache_size(struct kmem_cache *s, int num_groups)
 			return -ENOMEM;
 
 		new_params->is_root_cache = true;
+		INIT_LIST_HEAD(&new_params->children);
+		if (cur_params)
+			list_replace(&cur_params->children,
+				     &new_params->children);
 
 		/*
 		 * There is the chance it will be bigger than
@@ -2970,8 +2974,10 @@ int memcg_alloc_cache_params(struct mem_cgroup *memcg, struct kmem_cache *s,
 		s->memcg_params->memcg = memcg;
 		s->memcg_params->root_cache = root_cache;
 		css_get(&memcg->css);
-	} else
+	} else {
 		s->memcg_params->is_root_cache = true;
+		INIT_LIST_HEAD(&s->memcg_params->children);
+	}
 
 	return 0;
 }
@@ -3090,24 +3096,15 @@ static inline void memcg_resume_kmem_account(void)
 	current->memcg_kmem_skip_account--;
 }
 
-int __memcg_cleanup_cache_params(struct kmem_cache *s)
+void __memcg_cleanup_cache_params(struct kmem_cache *s)
 {
-	struct kmem_cache *c;
-	int i, failed = 0;
+	struct memcg_cache_params *params, *tmp;
 
 	mutex_lock(&memcg_slab_mutex);
-	for_each_memcg_cache_index(i) {
-		c = cache_from_memcg_idx(s, i);
-		if (!c)
-			continue;
-
-		memcg_unregister_cache(c);
-
-		if (cache_from_memcg_idx(s, i))
-			failed++;
-	}
+	list_for_each_entry_safe(params, tmp,
+			&s->memcg_params->children, siblings)
+		memcg_unregister_cache(params->cachep);
 	mutex_unlock(&memcg_slab_mutex);
-	return failed;
 }
 
 static void memcg_unregister_all_caches(struct mem_cgroup *memcg)
diff --git a/mm/slab.c b/mm/slab.c
index 1351725f7936..aed36c5b0bd9 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3780,29 +3780,35 @@ static int __do_tune_cpucache(struct kmem_cache *cachep, int limit,
 	return alloc_kmem_cache_node(cachep, gfp);
 }
 
+static void memcg_do_tune_cpucache(struct kmem_cache *cachep, int limit,
+				   int batchcount, int shared, gfp_t gfp)
+{
+#ifdef CONFIG_MEMCG_KMEM
+	struct memcg_cache_params *params;
+
+	if (!cachep->memcg_params ||
+	    !cachep->memcg_params->is_root_cache)
+		return;
+
+	lockdep_assert_held(&slab_mutex);
+	list_for_each_entry(params,
+			&cachep->memcg_params->children, siblings) {
+		/* return value determined by the parent cache only */
+		__do_tune_cpucache(params->cachep, limit,
+				   batchcount, shared, gfp);
+	}
+#endif
+}
+
 static int do_tune_cpucache(struct kmem_cache *cachep, int limit,
 				int batchcount, int shared, gfp_t gfp)
 {
 	int ret;
-	struct kmem_cache *c = NULL;
-	int i = 0;
 
 	ret = __do_tune_cpucache(cachep, limit, batchcount, shared, gfp);
-
-	if (slab_state < FULL)
-		return ret;
-
-	if ((ret < 0) || !is_root_cache(cachep))
-		return ret;
-
-	VM_BUG_ON(!mutex_is_locked(&slab_mutex));
-	for_each_memcg_cache_index(i) {
-		c = cache_from_memcg_idx(cachep, i);
-		if (c)
-			/* return value determined by the parent cache only */
-			__do_tune_cpucache(c, limit, batchcount, shared, gfp);
-	}
-
+	if (!ret)
+		memcg_do_tune_cpucache(cachep, limit,
+				       batchcount, shared, gfp);
 	return ret;
 }
 
diff --git a/mm/slab_common.c b/mm/slab_common.c
index a847fc86ac32..d80ec43ac4e0 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -287,7 +287,10 @@ struct kmem_cache *memcg_create_kmem_cache(struct mem_cgroup *memcg,
 				 root_cache->size, root_cache->align,
 				 root_cache->flags, root_cache->ctor,
 				 memcg, root_cache);
-	if (IS_ERR(s))
+	if (!IS_ERR(s))
+		list_add(&s->memcg_params->siblings,
+			 &root_cache->memcg_params->children);
+	else
 		s = NULL;
 
 	mutex_unlock(&slab_mutex);
@@ -300,17 +303,15 @@ struct kmem_cache *memcg_create_kmem_cache(struct mem_cgroup *memcg,
 
 static int memcg_cleanup_cache_params(struct kmem_cache *s)
 {
-	int rc;
-
 	if (!s->memcg_params ||
 	    !s->memcg_params->is_root_cache)
 		return 0;
 
 	mutex_unlock(&slab_mutex);
-	rc = __memcg_cleanup_cache_params(s);
+	__memcg_cleanup_cache_params(s);
 	mutex_lock(&slab_mutex);
 
-	return rc;
+	return !list_empty(&s->memcg_params->children);
 }
 #else
 static int memcg_cleanup_cache_params(struct kmem_cache *s)
@@ -347,6 +348,10 @@ void kmem_cache_destroy(struct kmem_cache *s)
 	}
 
 	list_del(&s->list);
+#ifdef CONFIG_MEMCG_KMEM
+	if (!is_root_cache(s))
+		list_del(&s->memcg_params->siblings);
+#endif
 
 	mutex_unlock(&slab_mutex);
 	if (s->flags & SLAB_DESTROY_BY_RCU)
@@ -685,20 +690,17 @@ void slab_stop(struct seq_file *m, void *p)
 static void
 memcg_accumulate_slabinfo(struct kmem_cache *s, struct slabinfo *info)
 {
-	struct kmem_cache *c;
+#ifdef CONFIG_MEMCG_KMEM
+	struct memcg_cache_params *params;
 	struct slabinfo sinfo;
-	int i;
 
-	if (!is_root_cache(s))
+	if (!s->memcg_params ||
+	    !s->memcg_params->is_root_cache)
 		return;
 
-	for_each_memcg_cache_index(i) {
-		c = cache_from_memcg_idx(s, i);
-		if (!c)
-			continue;
-
+	list_for_each_entry(params, &s->memcg_params->children, siblings) {
 		memset(&sinfo, 0, sizeof(sinfo));
-		get_slabinfo(c, &sinfo);
+		get_slabinfo(params->cachep, &sinfo);
 
 		info->active_slabs += sinfo.active_slabs;
 		info->num_slabs += sinfo.num_slabs;
@@ -706,6 +708,7 @@ memcg_accumulate_slabinfo(struct kmem_cache *s, struct slabinfo *info)
 		info->active_objs += sinfo.active_objs;
 		info->num_objs += sinfo.num_objs;
 	}
+#endif
 }
 
 int cache_show(struct kmem_cache *s, struct seq_file *m)
diff --git a/mm/slub.c b/mm/slub.c
index a1cdbad02f0c..4114bebc0b2e 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3690,6 +3690,23 @@ static struct kmem_cache *find_mergeable(size_t size, size_t align,
 	return NULL;
 }
 
+static void memcg_slab_merge(struct kmem_cache *s, size_t size)
+{
+#ifdef CONFIG_MEMCG_KMEM
+	struct kmem_cache *c;
+	struct memcg_cache_params *params;
+
+	if (!s->memcg_params)
+		return;
+
+	list_for_each_entry(params, &s->memcg_params->children, siblings) {
+		c = params->cachep;
+		c->object_size = s->object_size;
+		c->inuse = max_t(int, c->inuse, ALIGN(size, sizeof(void *)));
+	}
+#endif
+}
+
 struct kmem_cache *
 __kmem_cache_alias(const char *name, size_t size, size_t align,
 		   unsigned long flags, void (*ctor)(void *))
@@ -3698,9 +3715,6 @@ __kmem_cache_alias(const char *name, size_t size, size_t align,
 
 	s = find_mergeable(size, align, flags, name, ctor);
 	if (s) {
-		int i;
-		struct kmem_cache *c;
-
 		s->refcount++;
 
 		/*
@@ -3710,14 +3724,7 @@ __kmem_cache_alias(const char *name, size_t size, size_t align,
 		s->object_size = max(s->object_size, (int)size);
 		s->inuse = max_t(int, s->inuse, ALIGN(size, sizeof(void *)));
 
-		for_each_memcg_cache_index(i) {
-			c = cache_from_memcg_idx(s, i);
-			if (!c)
-				continue;
-			c->object_size = s->object_size;
-			c->inuse = max_t(int, c->inuse,
-					 ALIGN(size, sizeof(void *)));
-		}
+		memcg_slab_merge(s, size);
 
 		if (sysfs_slab_alias(s, name)) {
 			s->refcount--;
@@ -4968,7 +4975,7 @@ static ssize_t slab_attr_store(struct kobject *kobj,
 	err = attribute->store(s, buf, len);
 #ifdef CONFIG_MEMCG_KMEM
 	if (slab_state >= FULL && err >= 0 && is_root_cache(s)) {
-		int i;
+		struct memcg_cache_params *params;
 
 		mutex_lock(&slab_mutex);
 		if (s->max_attr_size < len)
@@ -4991,10 +4998,10 @@ static ssize_t slab_attr_store(struct kobject *kobj,
 		 * directly either failed or succeeded, in which case we loop
 		 * through the descendants with best-effort propagation.
 		 */
-		for_each_memcg_cache_index(i) {
-			struct kmem_cache *c = cache_from_memcg_idx(s, i);
-			if (c)
-				attribute->store(c, buf, len);
+		if (s->memcg_params) {
+			list_for_each_entry(params,
+					&s->memcg_params->children, siblings)
+				attribute->store(params->cachep, buf, len);
 		}
 		mutex_unlock(&slab_mutex);
 	}
-- 
1.7.10.4



* [PATCH -mm 6/6] memcg: release memcg_cache_id on css offline
  2014-07-21 11:47 [PATCH -mm 0/6] memcg: release memcg_cache_id on css offline Vladimir Davydov
                   ` (4 preceding siblings ...)
  2014-07-21 11:47 ` [PATCH -mm 5/6] memcg: keep all children of each root cache on a list Vladimir Davydov
@ 2014-07-21 11:47 ` Vladimir Davydov
  2014-07-23 10:53 ` [PATCH -mm 0/6] " Vladimir Davydov
  6 siblings, 0 replies; 8+ messages in thread
From: Vladimir Davydov @ 2014-07-21 11:47 UTC (permalink / raw)
  To: akpm; +Cc: mhocko, hannes, cl, linux-mm, linux-kernel

The memcg_cache_id (mem_cgroup->kmemcg_id) is used as the index in a root
cache's memcg_cache_params->memcg_caches array. Whenever a new
kmem-active cgroup is created, we must allocate an id for it. As a
result, the array size must always be greater than or equal to the
number of memory cgroups that have a memcg_cache_id assigned to them.

Currently we release the id only on css free. This is bad, because a css
can linger as a zombie for quite a long time after css offline,
occupying an array slot and making the arrays grow larger and larger.
Although the number of arrays is limited - only root kmem caches have
them - we can still experience problems while creating new kmem-active
cgroups, because they might require array relocation, and each array
relocation requires costly high-order page allocations if there are a
lot of ids allocated. The situation will become even worse when
per-memcg list_lru's are introduced, because each super block has a
list_lru, and the number of super blocks is practically unlimited.

So let's release memcg_cache_id on css offline - there's nothing that
prevents us from doing so.
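
The core idea, marking the id as -1 on offline and making every lookup
tolerate that, can be sketched in userspace as follows. This is a
simplified model with hypothetical names (pick_cache, memcg_offline).
It ignores locking and the asynchronous cache creation that the real
code performs.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SLOTS 8	/* hypothetical array size */

struct cache {
	const char *name;
	struct cache *memcg_caches[NR_SLOTS];	/* root caches only */
};

struct memcg {
	int kmemcg_id;	/* -1 when no id is assigned */
};

static int cache_id(const struct memcg *m)
{
	return m ? m->kmemcg_id : -1;
}

/* Pick the per-memcg cache, falling back to the root cache. */
static struct cache *pick_cache(struct cache *root, const struct memcg *m)
{
	int id = cache_id(m);

	assert(root != NULL);
	if (id < 0 || id >= NR_SLOTS)
		return root;	/* id already released, or none assigned */
	return root->memcg_caches[id] ? root->memcg_caches[id] : root;
}

/*
 * Offline: invalidate the id and clear the slot. Returns the id that
 * may now go back to the allocator, or -1 if there was none.
 */
static int memcg_offline(struct cache *root, struct memcg *m)
{
	int id = cache_id(m);

	if (id < 0)
		return -1;
	m->kmemcg_id = -1;
	if (id < NR_SLOTS)
		root->memcg_caches[id] = NULL;
	return id;
}
```

After memcg_offline, pick_cache for the same cgroup returns the root
cache, which mirrors the "id < 0" checks added by the diff below.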

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
---
 mm/memcontrol.c |   42 ++++++++++++++++++++++++++++++++++++------
 1 file changed, 36 insertions(+), 6 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 3ee37189e57e..edd951e1e185 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -648,10 +648,8 @@ EXPORT_SYMBOL(memcg_kmem_enabled_key);
 
 static void disarm_kmem_keys(struct mem_cgroup *memcg)
 {
-	if (memcg_kmem_is_active(memcg)) {
+	if (memcg_kmem_is_active(memcg))
 		static_key_slow_dec(&memcg_kmem_enabled_key);
-		ida_simple_remove(&kmem_limited_groups, memcg->kmemcg_id);
-	}
 	/*
 	 * This check can't live in kmem destruction function,
 	 * since the charges will outlive the cgroup
@@ -3003,6 +3001,12 @@ static void memcg_register_cache(struct mem_cgroup *memcg,
 	lockdep_assert_held(&memcg_slab_mutex);
 
 	id = memcg_cache_id(memcg);
+	/*
+	 * The cgroup was taken offline while the create work was pending,
+	 * nothing to do then.
+	 */
+	if (id < 0)
+		return;
 
 	/*
 	 * Since per-memcg caches are created asynchronously on first
@@ -3057,8 +3061,17 @@ static void memcg_unregister_cache(struct kmem_cache *cachep)
 	memcg = cachep->memcg_params->memcg;
 	id = memcg_cache_id(memcg);
 
-	BUG_ON(root_cache->memcg_params->memcg_caches[id] != cachep);
-	root_cache->memcg_params->memcg_caches[id] = NULL;
+	/*
+	 * This function can be called both after and before css offline. If
+	 * it's called before css offline, which happens on the root cache
+	 * destruction, we should clear the slot corresponding to the cache in
+	 * memcg_caches array. Otherwise the slot must have already been
+	 * cleared in memcg_unregister_all_caches.
+	 */
+	if (id >= 0) {
+		BUG_ON(root_cache->memcg_params->memcg_caches[id] != cachep);
+		root_cache->memcg_params->memcg_caches[id] = NULL;
+	}
 
 	list_del(&cachep->memcg_params->list);
 
@@ -3110,19 +3123,27 @@ void __memcg_cleanup_cache_params(struct kmem_cache *s)
 static void memcg_unregister_all_caches(struct mem_cgroup *memcg)
 {
 	struct memcg_cache_params *params, *tmp;
+	int id = memcg_cache_id(memcg);
 
 	if (!memcg_kmem_is_active(memcg))
 		return;
 
 	mutex_lock(&memcg_slab_mutex);
+	memcg->kmemcg_id = -1;
 	list_for_each_entry_safe(params, tmp, &memcg->memcg_slab_caches, list) {
 		struct kmem_cache *cachep = params->cachep;
+		struct kmem_cache *root_cache = params->root_cache;
+
+		BUG_ON(root_cache->memcg_params->memcg_caches[id] != cachep);
+		root_cache->memcg_params->memcg_caches[id] = NULL;
 
 		kmem_cache_shrink(cachep);
 		if (atomic_read(&cachep->memcg_params->nr_pages) == 0)
 			memcg_unregister_cache(cachep);
 	}
 	mutex_unlock(&memcg_slab_mutex);
+
+	ida_simple_remove(&kmem_limited_groups, id);
 }
 
 struct memcg_register_cache_work {
@@ -3221,6 +3242,7 @@ struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep,
 {
 	struct mem_cgroup *memcg;
 	struct kmem_cache *memcg_cachep;
+	int id;
 
 	VM_BUG_ON(!cachep->memcg_params);
 	VM_BUG_ON(!cachep->memcg_params->is_root_cache);
@@ -3234,7 +3256,15 @@ struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep,
 	if (!memcg_can_account_kmem(memcg))
 		goto out;
 
-	memcg_cachep = cache_from_memcg_idx(cachep, memcg_cache_id(memcg));
+	id = memcg_cache_id(memcg);
+	/*
+	 * This can happen if current was migrated to another cgroup and this
+	 * cgroup was taken offline after we issued mem_cgroup_from_task above.
+	 */
+	if (unlikely(id < 0))
+		goto out;
+
+	memcg_cachep = cache_from_memcg_idx(cachep, id);
 	if (likely(memcg_cachep)) {
 		cachep = memcg_cachep;
 		goto out;
-- 
1.7.10.4



* Re: [PATCH -mm 0/6] memcg: release memcg_cache_id on css offline
  2014-07-21 11:47 [PATCH -mm 0/6] memcg: release memcg_cache_id on css offline Vladimir Davydov
                   ` (5 preceding siblings ...)
  2014-07-21 11:47 ` [PATCH -mm 6/6] memcg: release memcg_cache_id on css offline Vladimir Davydov
@ 2014-07-23 10:53 ` Vladimir Davydov
  6 siblings, 0 replies; 8+ messages in thread
From: Vladimir Davydov @ 2014-07-23 10:53 UTC (permalink / raw)
  To: akpm; +Cc: mhocko, hannes, cl, linux-mm, linux-kernel

On Mon, Jul 21, 2014 at 03:47:10PM +0400, Vladimir Davydov wrote:
> This patch set makes memcg release memcg_cache_id on css offline. This
> way the memcg_caches arrays size will be limited by the number of alive
> kmem-active memory cgroups, which is much better.

Hi Andrew,

While preparing the per-memcg slab shrinkers patch set, I realized that
releasing memcg_cache_id on css offline is incorrect, because after css
offline there can still be elements on per-memcg list_lrus, which are
indexed by memcg_cache_id. We could re-parent them, but this is what we
decided to avoid in order to keep things clean and simple. So it seems
there's nothing we can do except keep memcg_cache_ids till css free.

I wonder if we could reclaim memory from per memcg arrays (per memcg
list_lrus, kmem_caches) on memory pressure. Maybe we could use
flex_array to achieve that.
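
One way to picture the problem is the userspace sketch below. It is an
interpretation, not the actual list_lru code: it assumes ids are reused
lowest-free-first and models each per-memcg LRU list as a plain
counter. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_IDS 4	/* hypothetical number of ids */

struct state {
	bool id_used[NR_IDS];
	int lru_items[NR_IDS];	/* per-id LRU list length */
};

static int id_get(struct state *s)
{
	for (int i = 0; i < NR_IDS; i++) {
		if (!s->id_used[i]) {
			s->id_used[i] = true;
			return i;
		}
	}
	return -1;
}

static void id_put(struct state *s, int id)
{
	assert(id >= 0 && id < NR_IDS);
	s->id_used[id] = false;
}

/*
 * Cgroup A leaves objects on its LRU list and goes offline, then a new
 * cgroup B is created. Returns how many of A's objects B finds on the
 * list that is indexed by B's id.
 */
static int stale_items_seen_by_new_cgroup(bool release_on_offline)
{
	struct state s = { 0 };
	int a, b;

	a = id_get(&s);
	assert(a >= 0);
	s.lru_items[a] = 3;	/* objects outlive A's offline */

	if (release_on_offline)
		id_put(&s, a);	/* id freed while the list is in use */

	b = id_get(&s);
	assert(b >= 0);
	return s.lru_items[b];
}
```

In this model, releasing the id on offline makes the new cgroup share a
list slot with the old one's leftover objects, while keeping the id
until css free gives the new cgroup an empty slot.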

Anyway, could you please drop the following patches from the mmotm tree
(all this set except patch 1, which is a mere cleanup)?

  memcg-release-memcg_cache_id-on-css-offline
  memcg-keep-all-children-of-each-root-cache-on-a-list
  memcg-add-pointer-to-owner-cache-to-memcg_cache_params
  memcg-make-memcg_cache_id-static
  slab-use-mem_cgroup_id-for-per-memcg-cache-naming

Sorry about the noise.

Thank you.
