* [PATCH 0/8] mm: memcontrol: account "kmem" in cgroup2
@ 2015-12-08 18:34 ` Johannes Weiner
  0 siblings, 0 replies; 79+ messages in thread
From: Johannes Weiner @ 2015-12-08 18:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Michal Hocko, Vladimir Davydov, linux-mm, cgroups, linux-kernel,
	kernel-team

Hi,

this series adds accounting of the historical "kmem" memory consumers
to the cgroup2 memory controller.

These consumers include the dentry cache, the inode cache, kernel
stack pages, and a few others that are pointed out in patch 7/8. The
footprint of these consumers is directly tied to userspace activity in
common workloads, and so they have to be part of the minimally viable
configuration in order to present a complete feature to our users.

The cgroup2 interface of the memory controller is far from complete,
but this series, along with the socket memory accounting series,
provides the final semantic changes for the existing memory knobs in
the cgroup2 interface, which is scheduled for initial release in the
next merge window.

Thanks!

 include/linux/list_lru.h     |   4 +-
 include/linux/memcontrol.h   | 330 +++++++++++++++++++++--------------------
 include/linux/sched.h        |   2 -
 include/linux/slab.h         |   2 +-
 include/linux/slab_def.h     |   3 +-
 include/linux/slub_def.h     |   2 +-
 include/net/tcp_memcontrol.h |   3 +-
 init/Kconfig                 |  10 +-
 mm/list_lru.c                |  12 +-
 mm/memcontrol.c              | 246 ++++++++++++++----------------
 mm/slab.h                    |   6 +-
 mm/slab_common.c             |  14 +-
 mm/slub.c                    |  10 +-
 mm/vmscan.c                  |   2 +-
 net/ipv4/Makefile            |   2 +-
 net/ipv4/tcp_memcontrol.c    |   2 +-
 16 files changed, 319 insertions(+), 331 deletions(-)


* [PATCH 1/8] mm: memcontrol: drop unused @css argument in memcg_init_kmem
  2015-12-08 18:34 ` Johannes Weiner
@ 2015-12-08 18:34   ` Johannes Weiner
  -1 siblings, 0 replies; 79+ messages in thread
From: Johannes Weiner @ 2015-12-08 18:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Michal Hocko, Vladimir Davydov, linux-mm, cgroups, linux-kernel,
	kernel-team

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 include/net/tcp_memcontrol.h | 3 ++-
 mm/memcontrol.c              | 6 +++---
 net/ipv4/tcp_memcontrol.c    | 2 +-
 3 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/include/net/tcp_memcontrol.h b/include/net/tcp_memcontrol.h
index 3a17b16..dc2da2f 100644
--- a/include/net/tcp_memcontrol.h
+++ b/include/net/tcp_memcontrol.h
@@ -1,6 +1,7 @@
 #ifndef _TCP_MEMCG_H
 #define _TCP_MEMCG_H
 
-int tcp_init_cgroup(struct mem_cgroup *memcg, struct cgroup_subsys *ss);
+int tcp_init_cgroup(struct mem_cgroup *memcg);
 void tcp_destroy_cgroup(struct mem_cgroup *memcg);
+
 #endif /* _TCP_MEMCG_H */
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 5fe45d68..eda8d43 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3561,7 +3561,7 @@ static int mem_cgroup_oom_control_write(struct cgroup_subsys_state *css,
 }
 
 #ifdef CONFIG_MEMCG_KMEM
-static int memcg_init_kmem(struct mem_cgroup *memcg, struct cgroup_subsys *ss)
+static int memcg_init_kmem(struct mem_cgroup *memcg)
 {
 	int ret;
 
@@ -3569,7 +3569,7 @@ static int memcg_init_kmem(struct mem_cgroup *memcg, struct cgroup_subsys *ss)
 	if (ret)
 		return ret;
 
-	return tcp_init_cgroup(memcg, ss);
+	return tcp_init_cgroup(memcg);
 }
 
 static void memcg_deactivate_kmem(struct mem_cgroup *memcg)
@@ -4252,7 +4252,7 @@ mem_cgroup_css_online(struct cgroup_subsys_state *css)
 	}
 	mutex_unlock(&memcg_create_mutex);
 
-	ret = memcg_init_kmem(memcg, &memory_cgrp_subsys);
+	ret = memcg_init_kmem(memcg);
 	if (ret)
 		return ret;
 
diff --git a/net/ipv4/tcp_memcontrol.c b/net/ipv4/tcp_memcontrol.c
index 18bc7f7..133eb5e 100644
--- a/net/ipv4/tcp_memcontrol.c
+++ b/net/ipv4/tcp_memcontrol.c
@@ -6,7 +6,7 @@
 #include <linux/memcontrol.h>
 #include <linux/module.h>
 
-int tcp_init_cgroup(struct mem_cgroup *memcg, struct cgroup_subsys *ss)
+int tcp_init_cgroup(struct mem_cgroup *memcg)
 {
 	struct mem_cgroup *parent = parent_mem_cgroup(memcg);
 	struct page_counter *counter_parent = NULL;
-- 
2.6.3


* [PATCH 2/8] mm: memcontrol: remove double kmem page_counter init
  2015-12-08 18:34 ` Johannes Weiner
@ 2015-12-08 18:34   ` Johannes Weiner
  -1 siblings, 0 replies; 79+ messages in thread
From: Johannes Weiner @ 2015-12-08 18:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Michal Hocko, Vladimir Davydov, linux-mm, cgroups, linux-kernel,
	kernel-team

The kmem page_counter's limit is initialized to PAGE_COUNTER_MAX
inside mem_cgroup_css_online(). There is no need to repeat this
from memcg_propagate_kmem().

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/memcontrol.c | 24 ++++++++++--------------
 1 file changed, 10 insertions(+), 14 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index eda8d43..02167db 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2840,8 +2840,7 @@ static u64 mem_cgroup_read_u64(struct cgroup_subsys_state *css,
 }
 
 #ifdef CONFIG_MEMCG_KMEM
-static int memcg_activate_kmem(struct mem_cgroup *memcg,
-			       unsigned long nr_pages)
+static int memcg_activate_kmem(struct mem_cgroup *memcg)
 {
 	int err = 0;
 	int memcg_id;
@@ -2876,13 +2875,6 @@ static int memcg_activate_kmem(struct mem_cgroup *memcg,
 		goto out;
 	}
 
-	/*
-	 * We couldn't have accounted to this cgroup, because it hasn't got
-	 * activated yet, so this should succeed.
-	 */
-	err = page_counter_limit(&memcg->kmem, nr_pages);
-	VM_BUG_ON(err);
-
 	static_branch_inc(&memcg_kmem_enabled_key);
 	/*
 	 * A memory cgroup is considered kmem-active as soon as it gets
@@ -2903,10 +2895,14 @@ static int memcg_update_kmem_limit(struct mem_cgroup *memcg,
 	int ret;
 
 	mutex_lock(&memcg_limit_mutex);
-	if (!memcg_kmem_is_active(memcg))
-		ret = memcg_activate_kmem(memcg, limit);
-	else
-		ret = page_counter_limit(&memcg->kmem, limit);
+	/* Top-level cgroup doesn't propagate from root */
+	if (!memcg_kmem_is_active(memcg)) {
+		ret = memcg_activate_kmem(memcg);
+		if (ret)
+			goto out;
+	}
+	ret = page_counter_limit(&memcg->kmem, limit);
+out:
 	mutex_unlock(&memcg_limit_mutex);
 	return ret;
 }
@@ -2925,7 +2921,7 @@ static int memcg_propagate_kmem(struct mem_cgroup *memcg)
 	 * after this point, because it has at least one child already.
 	 */
 	if (memcg_kmem_is_active(parent))
-		ret = memcg_activate_kmem(memcg, PAGE_COUNTER_MAX);
+		ret = memcg_activate_kmem(memcg);
 	mutex_unlock(&memcg_limit_mutex);
 	return ret;
 }
-- 
2.6.3
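
As a rough illustration of the control flow this patch leaves behind, the following userspace C sketch models a counter whose limit is set to the maximum once at online time, an activation step that no longer touches the limit, and a limit update that activates first and then always applies the limit. All toy_* names and TOY_COUNTER_MAX are hypothetical stand-ins rather than kernel symbols, locking is omitted, and toy_counter_limit() only loosely imitates page_counter_limit() under the assumption that it refuses a limit below current usage.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for PAGE_COUNTER_MAX. */
#define TOY_COUNTER_MAX ULONG_MAX

struct toy_counter {
	unsigned long usage;
	unsigned long limit;
};

/* Hypothetical, heavily reduced stand-in for struct mem_cgroup. */
struct toy_memcg {
	bool kmem_active;
	struct toy_counter kmem;
};

/* Assumed behaviour: refuse a limit that is below current usage. */
static int toy_counter_limit(struct toy_counter *counter, unsigned long limit)
{
	if (counter->usage > limit)
		return -EBUSY;
	counter->limit = limit;
	return 0;
}

/* The limit is initialized to the maximum once, when the group comes online. */
static void toy_css_online(struct toy_memcg *memcg)
{
	memcg->kmem_active = false;
	memcg->kmem.usage = 0;
	memcg->kmem.limit = TOY_COUNTER_MAX;
}

/* After the patch, activation neither takes nor applies a limit. */
static int toy_activate_kmem(struct toy_memcg *memcg)
{
	memcg->kmem_active = true;
	return 0;
}

/* Activate if necessary, then always apply the requested limit. */
static int toy_update_kmem_limit(struct toy_memcg *memcg, unsigned long limit)
{
	int ret;

	if (!memcg->kmem_active) {
		ret = toy_activate_kmem(memcg);
		if (ret)
			goto out;
	}
	ret = toy_counter_limit(&memcg->kmem, limit);
out:
	return ret;
}

/* A child inherits activation only; its limit stays at the online default. */
static int toy_propagate_kmem(struct toy_memcg *memcg,
			      const struct toy_memcg *parent)
{
	int ret = 0;

	if (parent->kmem_active)
		ret = toy_activate_kmem(memcg);
	return ret;
}
```

In this model, calling toy_css_online() on a parent and a child, then toy_update_kmem_limit() on the parent and toy_propagate_kmem() on the child, leaves the child active with its limit still at TOY_COUNTER_MAX, which is why repeating the initialization during propagation is redundant.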


* [PATCH 3/8] mm: memcontrol: give the kmem states more descriptive names
  2015-12-08 18:34 ` Johannes Weiner
@ 2015-12-08 18:34   ` Johannes Weiner
  -1 siblings, 0 replies; 79+ messages in thread
From: Johannes Weiner @ 2015-12-08 18:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Michal Hocko, Vladimir Davydov, linux-mm, cgroups, linux-kernel,
	kernel-team

On any given memcg, the kmem accounting feature has three separate
states: not initialized, structures allocated, and actively accounting
slab memory. These are represented through a combination of the
kmem_acct_activated and kmem_acct_active flags, which is confusing.

Convert to a kmem_state enum with the states NONE, ALLOCATED, and
ONLINE. Then rename the functions to modify the state accordingly.
This follows the nomenclature of css object states more closely.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 include/linux/memcontrol.h | 15 ++++++++-----
 mm/memcontrol.c            | 52 ++++++++++++++++++++++------------------------
 mm/slab_common.c           |  4 ++--
 mm/vmscan.c                |  2 +-
 4 files changed, 38 insertions(+), 35 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 189f04d..54dab4d 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -152,6 +152,12 @@ struct mem_cgroup_thresholds {
 	struct mem_cgroup_threshold_ary *spare;
 };
 
+enum memcg_kmem_state {
+	KMEM_NONE,
+	KMEM_ALLOCATED,
+	KMEM_ONLINE,
+};
+
 /*
  * The memory controller data structure. The memory controller controls both
  * page cache and RSS per cgroup. We would eventually like to provide
@@ -233,8 +239,7 @@ struct mem_cgroup {
 #if defined(CONFIG_MEMCG_KMEM)
         /* Index in the kmem_cache->memcg_params.memcg_caches array */
 	int kmemcg_id;
-	bool kmem_acct_activated;
-	bool kmem_acct_active;
+	enum memcg_kmem_state kmem_state;
 #endif
 
 	int last_scanned_node;
@@ -750,9 +755,9 @@ static inline bool memcg_kmem_enabled(void)
 	return static_branch_unlikely(&memcg_kmem_enabled_key);
 }
 
-static inline bool memcg_kmem_is_active(struct mem_cgroup *memcg)
+static inline bool memcg_kmem_online(struct mem_cgroup *memcg)
 {
-	return memcg->kmem_acct_active;
+	return memcg->kmem_state == KMEM_ONLINE;
 }
 
 /*
@@ -850,7 +855,7 @@ static inline bool memcg_kmem_enabled(void)
 	return false;
 }
 
-static inline bool memcg_kmem_is_active(struct mem_cgroup *memcg)
+static inline bool memcg_kmem_online(struct mem_cgroup *memcg)
 {
 	return false;
 }
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 02167db..22b8c4f 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2357,7 +2357,7 @@ int __memcg_kmem_charge_memcg(struct page *page, gfp_t gfp, int order,
 	struct page_counter *counter;
 	int ret;
 
-	if (!memcg_kmem_is_active(memcg))
+	if (!memcg_kmem_online(memcg))
 		return 0;
 
 	if (!page_counter_try_charge(&memcg->kmem, nr_pages, &counter))
@@ -2840,14 +2840,13 @@ static u64 mem_cgroup_read_u64(struct cgroup_subsys_state *css,
 }
 
 #ifdef CONFIG_MEMCG_KMEM
-static int memcg_activate_kmem(struct mem_cgroup *memcg)
+static int memcg_online_kmem(struct mem_cgroup *memcg)
 {
 	int err = 0;
 	int memcg_id;
 
 	BUG_ON(memcg->kmemcg_id >= 0);
-	BUG_ON(memcg->kmem_acct_activated);
-	BUG_ON(memcg->kmem_acct_active);
+	BUG_ON(memcg->kmem_state);
 
 	/*
 	 * For simplicity, we won't allow this to be disabled.  It also can't
@@ -2877,14 +2876,13 @@ static int memcg_activate_kmem(struct mem_cgroup *memcg)
 
 	static_branch_inc(&memcg_kmem_enabled_key);
 	/*
-	 * A memory cgroup is considered kmem-active as soon as it gets
+	 * A memory cgroup is considered kmem-online as soon as it gets
 	 * kmemcg_id. Setting the id after enabling static branching will
 	 * guarantee no one starts accounting before all call sites are
 	 * patched.
 	 */
 	memcg->kmemcg_id = memcg_id;
-	memcg->kmem_acct_activated = true;
-	memcg->kmem_acct_active = true;
+	memcg->kmem_state = KMEM_ONLINE;
 out:
 	return err;
 }
@@ -2896,8 +2894,8 @@ static int memcg_update_kmem_limit(struct mem_cgroup *memcg,
 
 	mutex_lock(&memcg_limit_mutex);
 	/* Top-level cgroup doesn't propagate from root */
-	if (!memcg_kmem_is_active(memcg)) {
-		ret = memcg_activate_kmem(memcg);
+	if (!memcg_kmem_online(memcg)) {
+		ret = memcg_online_kmem(memcg);
 		if (ret)
 			goto out;
 	}
@@ -2917,11 +2915,12 @@ static int memcg_propagate_kmem(struct mem_cgroup *memcg)
 
 	mutex_lock(&memcg_limit_mutex);
 	/*
-	 * If the parent cgroup is not kmem-active now, it cannot be activated
-	 * after this point, because it has at least one child already.
+	 * If the parent cgroup is not kmem-online now, it cannot be
+	 * onlined after this point, because it has at least one child
+	 * already.
 	 */
-	if (memcg_kmem_is_active(parent))
-		ret = memcg_activate_kmem(memcg);
+	if (memcg_kmem_online(parent))
+		ret = memcg_online_kmem(memcg);
 	mutex_unlock(&memcg_limit_mutex);
 	return ret;
 }
@@ -3568,22 +3567,21 @@ static int memcg_init_kmem(struct mem_cgroup *memcg)
 	return tcp_init_cgroup(memcg);
 }
 
-static void memcg_deactivate_kmem(struct mem_cgroup *memcg)
+static void memcg_offline_kmem(struct mem_cgroup *memcg)
 {
 	struct cgroup_subsys_state *css;
 	struct mem_cgroup *parent, *child;
 	int kmemcg_id;
 
-	if (!memcg->kmem_acct_active)
+	if (memcg->kmem_state != KMEM_ONLINE)
 		return;
-
 	/*
-	 * Clear the 'active' flag before clearing memcg_caches arrays entries.
-	 * Since we take the slab_mutex in memcg_deactivate_kmem_caches(), it
-	 * guarantees no cache will be created for this cgroup after we are
-	 * done (see memcg_create_kmem_cache()).
+	 * Clear the online state before clearing memcg_caches array
+	 * entries. The slab_mutex in memcg_deactivate_kmem_caches()
+	 * guarantees that no cache will be created for this cgroup
+	 * after we are done (see memcg_create_kmem_cache()).
 	 */
-	memcg->kmem_acct_active = false;
+	memcg->kmem_state = KMEM_ALLOCATED;
 
 	memcg_deactivate_kmem_caches(memcg);
 
@@ -3614,9 +3612,9 @@ static void memcg_deactivate_kmem(struct mem_cgroup *memcg)
 	memcg_free_cache_id(kmemcg_id);
 }
 
-static void memcg_destroy_kmem(struct mem_cgroup *memcg)
+static void memcg_free_kmem(struct mem_cgroup *memcg)
 {
-	if (memcg->kmem_acct_activated) {
+	if (memcg->kmem_state == KMEM_ALLOCATED) {
 		memcg_destroy_kmem_caches(memcg);
 		static_branch_dec(&memcg_kmem_enabled_key);
 		WARN_ON(page_counter_read(&memcg->kmem));
@@ -3629,11 +3627,11 @@ static int memcg_init_kmem(struct mem_cgroup *memcg, struct cgroup_subsys *ss)
 	return 0;
 }
 
-static void memcg_deactivate_kmem(struct mem_cgroup *memcg)
+static void memcg_offline_kmem(struct mem_cgroup *memcg)
 {
 }
 
-static void memcg_destroy_kmem(struct mem_cgroup *memcg)
+static void memcg_free_kmem(struct mem_cgroup *memcg)
 {
 }
 #endif
@@ -4286,7 +4284,7 @@ static void mem_cgroup_css_offline(struct cgroup_subsys_state *css)
 
 	vmpressure_cleanup(&memcg->vmpressure);
 
-	memcg_deactivate_kmem(memcg);
+	memcg_offline_kmem(memcg);
 
 	wb_memcg_offline(memcg);
 }
@@ -4295,7 +4293,7 @@ static void mem_cgroup_css_free(struct cgroup_subsys_state *css)
 {
 	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 
-	memcg_destroy_kmem(memcg);
+	memcg_free_kmem(memcg);
 #ifdef CONFIG_INET
 	if (cgroup_subsys_on_dfl(memory_cgrp_subsys) && !cgroup_memory_nosocket)
 		static_branch_dec(&memcg_sockets_enabled_key);
diff --git a/mm/slab_common.c b/mm/slab_common.c
index e016178..8c262e6 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -503,10 +503,10 @@ void memcg_create_kmem_cache(struct mem_cgroup *memcg,
 	mutex_lock(&slab_mutex);
 
 	/*
-	 * The memory cgroup could have been deactivated while the cache
+	 * The memory cgroup could have been offlined while the cache
 	 * creation work was pending.
 	 */
-	if (!memcg_kmem_is_active(memcg))
+	if (!memcg_kmem_online(memcg))
 		goto out_unlock;
 
 	idx = memcg_cache_id(memcg);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 50e54c0..2dbc679 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -411,7 +411,7 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
 	struct shrinker *shrinker;
 	unsigned long freed = 0;
 
-	if (memcg && !memcg_kmem_is_active(memcg))
+	if (memcg && !memcg_kmem_online(memcg))
 		return 0;
 
 	if (nr_scanned == 0)
-- 
2.6.3
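
The three kmem states named in the changelog can be pictured as a small state machine. The userspace C sketch below is only an illustration under stated assumptions: the enum copies the one added by the patch, while struct toy_memcg and every toy_* function are hypothetical stand-ins. The kernel functions also allocate cache ids, patch static branches and take locks, and memcg_online_kmem() uses BUG_ON() where this sketch returns an error.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Same states as the enum introduced by the patch. */
enum memcg_kmem_state {
	KMEM_NONE,
	KMEM_ALLOCATED,
	KMEM_ONLINE,
};

/* Hypothetical, heavily reduced stand-in for struct mem_cgroup. */
struct toy_memcg {
	enum memcg_kmem_state kmem_state;
};

/* Counterpart of memcg_kmem_online(): only ONLINE groups are accounted. */
static bool toy_kmem_online(const struct toy_memcg *memcg)
{
	return memcg->kmem_state == KMEM_ONLINE;
}

/* NONE -> ONLINE, loosely following memcg_online_kmem(). */
static int toy_online_kmem(struct toy_memcg *memcg)
{
	if (memcg->kmem_state != KMEM_NONE)
		return -EINVAL;	/* the kernel uses BUG_ON() here instead */
	memcg->kmem_state = KMEM_ONLINE;
	return 0;
}

/* ONLINE -> ALLOCATED, loosely following memcg_offline_kmem(). */
static void toy_offline_kmem(struct toy_memcg *memcg)
{
	if (memcg->kmem_state != KMEM_ONLINE)
		return;
	memcg->kmem_state = KMEM_ALLOCATED;
}

/* Mirrors the check in memcg_free_kmem(): tear down only when ALLOCATED. */
static bool toy_free_kmem_needed(const struct toy_memcg *memcg)
{
	return memcg->kmem_state == KMEM_ALLOCATED;
}
```

A group that starts in KMEM_NONE goes to KMEM_ONLINE through toy_online_kmem() and drops to KMEM_ALLOCATED through toy_offline_kmem(). Only in that last state does toy_free_kmem_needed() report that teardown should run, which matches the css offline-then-free ordering the changelog refers to.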


^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 3/8] mm: memcontrol: give the kmem states more descriptive names
@ 2015-12-08 18:34   ` Johannes Weiner
  0 siblings, 0 replies; 79+ messages in thread
From: Johannes Weiner @ 2015-12-08 18:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Michal Hocko, Vladimir Davydov, linux-mm, cgroups, linux-kernel,
	kernel-team

On any given memcg, the kmem accounting feature has three separate
states: not initialized, structures allocated, and actively accounting
slab memory. These are represented through a combination of the
kmem_acct_activated and kmem_acct_active flags, which is confusing.

Convert to a kmem_state enum with the states NONE, ALLOCATED, and
ONLINE. Then rename the functions to modify the state accordingly.
This follows the nomenclature of css object states more closely.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 include/linux/memcontrol.h | 15 ++++++++-----
 mm/memcontrol.c            | 52 ++++++++++++++++++++++------------------------
 mm/slab_common.c           |  4 ++--
 mm/vmscan.c                |  2 +-
 4 files changed, 38 insertions(+), 35 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 189f04d..54dab4d 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -152,6 +152,12 @@ struct mem_cgroup_thresholds {
 	struct mem_cgroup_threshold_ary *spare;
 };
 
+enum memcg_kmem_state {
+	KMEM_NONE,
+	KMEM_ALLOCATED,
+	KMEM_ONLINE,
+};
+
 /*
  * The memory controller data structure. The memory controller controls both
  * page cache and RSS per cgroup. We would eventually like to provide
@@ -233,8 +239,7 @@ struct mem_cgroup {
 #if defined(CONFIG_MEMCG_KMEM)
         /* Index in the kmem_cache->memcg_params.memcg_caches array */
 	int kmemcg_id;
-	bool kmem_acct_activated;
-	bool kmem_acct_active;
+	enum memcg_kmem_state kmem_state;
 #endif
 
 	int last_scanned_node;
@@ -750,9 +755,9 @@ static inline bool memcg_kmem_enabled(void)
 	return static_branch_unlikely(&memcg_kmem_enabled_key);
 }
 
-static inline bool memcg_kmem_is_active(struct mem_cgroup *memcg)
+static inline bool memcg_kmem_online(struct mem_cgroup *memcg)
 {
-	return memcg->kmem_acct_active;
+	return memcg->kmem_state == KMEM_ONLINE;
 }
 
 /*
@@ -850,7 +855,7 @@ static inline bool memcg_kmem_enabled(void)
 	return false;
 }
 
-static inline bool memcg_kmem_is_active(struct mem_cgroup *memcg)
+static inline bool memcg_kmem_online(struct mem_cgroup *memcg)
 {
 	return false;
 }
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 02167db..22b8c4f 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2357,7 +2357,7 @@ int __memcg_kmem_charge_memcg(struct page *page, gfp_t gfp, int order,
 	struct page_counter *counter;
 	int ret;
 
-	if (!memcg_kmem_is_active(memcg))
+	if (!memcg_kmem_online(memcg))
 		return 0;
 
 	if (!page_counter_try_charge(&memcg->kmem, nr_pages, &counter))
@@ -2840,14 +2840,13 @@ static u64 mem_cgroup_read_u64(struct cgroup_subsys_state *css,
 }
 
 #ifdef CONFIG_MEMCG_KMEM
-static int memcg_activate_kmem(struct mem_cgroup *memcg)
+static int memcg_online_kmem(struct mem_cgroup *memcg)
 {
 	int err = 0;
 	int memcg_id;
 
 	BUG_ON(memcg->kmemcg_id >= 0);
-	BUG_ON(memcg->kmem_acct_activated);
-	BUG_ON(memcg->kmem_acct_active);
+	BUG_ON(memcg->kmem_state);
 
 	/*
 	 * For simplicity, we won't allow this to be disabled.  It also can't
@@ -2877,14 +2876,13 @@ static int memcg_activate_kmem(struct mem_cgroup *memcg)
 
 	static_branch_inc(&memcg_kmem_enabled_key);
 	/*
-	 * A memory cgroup is considered kmem-active as soon as it gets
+	 * A memory cgroup is considered kmem-online as soon as it gets
 	 * kmemcg_id. Setting the id after enabling static branching will
 	 * guarantee no one starts accounting before all call sites are
 	 * patched.
 	 */
 	memcg->kmemcg_id = memcg_id;
-	memcg->kmem_acct_activated = true;
-	memcg->kmem_acct_active = true;
+	memcg->kmem_state = KMEM_ONLINE;
 out:
 	return err;
 }
@@ -2896,8 +2894,8 @@ static int memcg_update_kmem_limit(struct mem_cgroup *memcg,
 
 	mutex_lock(&memcg_limit_mutex);
 	/* Top-level cgroup doesn't propagate from root */
-	if (!memcg_kmem_is_active(memcg)) {
-		ret = memcg_activate_kmem(memcg);
+	if (!memcg_kmem_online(memcg)) {
+		ret = memcg_online_kmem(memcg);
 		if (ret)
 			goto out;
 	}
@@ -2917,11 +2915,12 @@ static int memcg_propagate_kmem(struct mem_cgroup *memcg)
 
 	mutex_lock(&memcg_limit_mutex);
 	/*
-	 * If the parent cgroup is not kmem-active now, it cannot be activated
-	 * after this point, because it has at least one child already.
+	 * If the parent cgroup is not kmem-online now, it cannot be
+	 * onlined after this point, because it has at least one child
+	 * already.
 	 */
-	if (memcg_kmem_is_active(parent))
-		ret = memcg_activate_kmem(memcg);
+	if (memcg_kmem_online(parent))
+		ret = memcg_online_kmem(memcg);
 	mutex_unlock(&memcg_limit_mutex);
 	return ret;
 }
@@ -3568,22 +3567,21 @@ static int memcg_init_kmem(struct mem_cgroup *memcg)
 	return tcp_init_cgroup(memcg);
 }
 
-static void memcg_deactivate_kmem(struct mem_cgroup *memcg)
+static void memcg_offline_kmem(struct mem_cgroup *memcg)
 {
 	struct cgroup_subsys_state *css;
 	struct mem_cgroup *parent, *child;
 	int kmemcg_id;
 
-	if (!memcg->kmem_acct_active)
+	if (memcg->kmem_state != KMEM_ONLINE)
 		return;
-
 	/*
-	 * Clear the 'active' flag before clearing memcg_caches arrays entries.
-	 * Since we take the slab_mutex in memcg_deactivate_kmem_caches(), it
-	 * guarantees no cache will be created for this cgroup after we are
-	 * done (see memcg_create_kmem_cache()).
+	 * Clear the online state before clearing memcg_caches array
+	 * entries. The slab_mutex in memcg_deactivate_kmem_caches()
+	 * guarantees that no cache will be created for this cgroup
+	 * after we are done (see memcg_create_kmem_cache()).
 	 */
-	memcg->kmem_acct_active = false;
+	memcg->kmem_state = KMEM_ALLOCATED;
 
 	memcg_deactivate_kmem_caches(memcg);
 
@@ -3614,9 +3612,9 @@ static void memcg_deactivate_kmem(struct mem_cgroup *memcg)
 	memcg_free_cache_id(kmemcg_id);
 }
 
-static void memcg_destroy_kmem(struct mem_cgroup *memcg)
+static void memcg_free_kmem(struct mem_cgroup *memcg)
 {
-	if (memcg->kmem_acct_activated) {
+	if (memcg->kmem_state == KMEM_ALLOCATED) {
 		memcg_destroy_kmem_caches(memcg);
 		static_branch_dec(&memcg_kmem_enabled_key);
 		WARN_ON(page_counter_read(&memcg->kmem));
@@ -3629,11 +3627,11 @@ static int memcg_init_kmem(struct mem_cgroup *memcg, struct cgroup_subsys *ss)
 	return 0;
 }
 
-static void memcg_deactivate_kmem(struct mem_cgroup *memcg)
+static void memcg_offline_kmem(struct mem_cgroup *memcg)
 {
 }
 
-static void memcg_destroy_kmem(struct mem_cgroup *memcg)
+static void memcg_free_kmem(struct mem_cgroup *memcg)
 {
 }
 #endif
@@ -4286,7 +4284,7 @@ static void mem_cgroup_css_offline(struct cgroup_subsys_state *css)
 
 	vmpressure_cleanup(&memcg->vmpressure);
 
-	memcg_deactivate_kmem(memcg);
+	memcg_offline_kmem(memcg);
 
 	wb_memcg_offline(memcg);
 }
@@ -4295,7 +4293,7 @@ static void mem_cgroup_css_free(struct cgroup_subsys_state *css)
 {
 	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 
-	memcg_destroy_kmem(memcg);
+	memcg_free_kmem(memcg);
 #ifdef CONFIG_INET
 	if (cgroup_subsys_on_dfl(memory_cgrp_subsys) && !cgroup_memory_nosocket)
 		static_branch_dec(&memcg_sockets_enabled_key);
diff --git a/mm/slab_common.c b/mm/slab_common.c
index e016178..8c262e6 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -503,10 +503,10 @@ void memcg_create_kmem_cache(struct mem_cgroup *memcg,
 	mutex_lock(&slab_mutex);
 
 	/*
-	 * The memory cgroup could have been deactivated while the cache
+	 * The memory cgroup could have been offlined while the cache
 	 * creation work was pending.
 	 */
-	if (!memcg_kmem_is_active(memcg))
+	if (!memcg_kmem_online(memcg))
 		goto out_unlock;
 
 	idx = memcg_cache_id(memcg);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 50e54c0..2dbc679 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -411,7 +411,7 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
 	struct shrinker *shrinker;
 	unsigned long freed = 0;
 
-	if (memcg && !memcg_kmem_is_active(memcg))
+	if (memcg && !memcg_kmem_online(memcg))
 		return 0;
 
 	if (nr_scanned == 0)
-- 
2.6.3

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 4/8] mm: memcontrol: group kmem init and exit functions together
  2015-12-08 18:34 ` Johannes Weiner
@ 2015-12-08 18:34   ` Johannes Weiner
  -1 siblings, 0 replies; 79+ messages in thread
From: Johannes Weiner @ 2015-12-08 18:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Michal Hocko, Vladimir Davydov, linux-mm, cgroups, linux-kernel,
	kernel-team

Put all the code that sets up and tears down the kmem accounting
state into the same location. No functional change intended.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
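For orientation while reading the moved functions: both teardown paths
key off memcg->kmem_state. Below is a minimal userspace sketch of that
lifecycle, not kernel code. It assumes the state starts at a KMEM_NONE
value (not shown in this patch), and every model_* name and counter is
hypothetical, introduced only for illustration.

```c
#include <assert.h>

/* Userspace model, not kernel code. KMEM_NONE is an assumed initial state. */
enum model_kmem_state {
	KMEM_NONE,
	KMEM_ALLOCATED,
	KMEM_ONLINE,
};

struct model_memcg {
	enum model_kmem_state kmem_state;
	int caches_deactivated;	/* counts offline-time work */
	int caches_destroyed;	/* counts free-time work */
};

/* Hypothetical stand-in for enabling kmem accounting on a cgroup. */
static void model_online_kmem(struct model_memcg *memcg)
{
	if (memcg->kmem_state == KMEM_NONE)
		memcg->kmem_state = KMEM_ONLINE;
}

/* Mirrors the check in memcg_offline_kmem(): act only when online. */
static void model_offline_kmem(struct model_memcg *memcg)
{
	if (memcg->kmem_state != KMEM_ONLINE)
		return;
	/* Drop the online state first, then deactivate the caches. */
	memcg->kmem_state = KMEM_ALLOCATED;
	memcg->caches_deactivated++;
}

/* Mirrors the check in memcg_free_kmem(): tear down only what was set up. */
static void model_free_kmem(struct model_memcg *memcg)
{
	/* Assumption of this sketch: free is only reached after offline. */
	assert(memcg->kmem_state != KMEM_ONLINE);
	if (memcg->kmem_state == KMEM_ALLOCATED)
		memcg->caches_destroyed++;
}
```

In this model a cgroup that never enabled accounting passes through
both teardown calls without doing any work, and a second offline call
on the same cgroup is a no-op.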
 mm/memcontrol.c | 157 +++++++++++++++++++++++++++-----------------------------
 1 file changed, 76 insertions(+), 81 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 22b8c4f..5118618 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2924,12 +2924,88 @@ static int memcg_propagate_kmem(struct mem_cgroup *memcg)
 	mutex_unlock(&memcg_limit_mutex);
 	return ret;
 }
+
+static int memcg_init_kmem(struct mem_cgroup *memcg)
+{
+	int ret;
+
+	ret = memcg_propagate_kmem(memcg);
+	if (ret)
+		return ret;
+
+	return tcp_init_cgroup(memcg);
+}
+
+static void memcg_offline_kmem(struct mem_cgroup *memcg)
+{
+	struct cgroup_subsys_state *css;
+	struct mem_cgroup *parent, *child;
+	int kmemcg_id;
+
+	if (memcg->kmem_state != KMEM_ONLINE)
+		return;
+	/*
+	 * Clear the online state before clearing memcg_caches array
+	 * entries. The slab_mutex in memcg_deactivate_kmem_caches()
+	 * guarantees that no cache will be created for this cgroup
+	 * after we are done (see memcg_create_kmem_cache()).
+	 */
+	memcg->kmem_state = KMEM_ALLOCATED;
+
+	memcg_deactivate_kmem_caches(memcg);
+
+	kmemcg_id = memcg->kmemcg_id;
+	BUG_ON(kmemcg_id < 0);
+
+	parent = parent_mem_cgroup(memcg);
+	if (!parent)
+		parent = root_mem_cgroup;
+
+	/*
+	 * Change kmemcg_id of this cgroup and all its descendants to the
+	 * parent's id, and then move all entries from this cgroup's list_lrus
+	 * to ones of the parent. After we have finished, all list_lrus
+	 * corresponding to this cgroup are guaranteed to remain empty. The
+	 * ordering is imposed by list_lru_node->lock taken by
+	 * memcg_drain_all_list_lrus().
+	 */
+	css_for_each_descendant_pre(css, &memcg->css) {
+		child = mem_cgroup_from_css(css);
+		BUG_ON(child->kmemcg_id != kmemcg_id);
+		child->kmemcg_id = parent->kmemcg_id;
+		if (!memcg->use_hierarchy)
+			break;
+	}
+	memcg_drain_all_list_lrus(kmemcg_id, parent->kmemcg_id);
+
+	memcg_free_cache_id(kmemcg_id);
+}
+
+static void memcg_free_kmem(struct mem_cgroup *memcg)
+{
+	if (memcg->kmem_state == KMEM_ALLOCATED) {
+		memcg_destroy_kmem_caches(memcg);
+		static_branch_dec(&memcg_kmem_enabled_key);
+		WARN_ON(page_counter_read(&memcg->kmem));
+	}
+	tcp_destroy_cgroup(memcg);
+}
 #else
 static int memcg_update_kmem_limit(struct mem_cgroup *memcg,
 				   unsigned long limit)
 {
 	return -EINVAL;
 }
+static int memcg_init_kmem(struct mem_cgroup *memcg)
+{
+	return 0;
+}
+static void memcg_offline_kmem(struct mem_cgroup *memcg)
+{
+}
+static void memcg_free_kmem(struct mem_cgroup *memcg)
+{
+}
 #endif /* CONFIG_MEMCG_KMEM */
 
 /*
@@ -3555,87 +3631,6 @@ static int mem_cgroup_oom_control_write(struct cgroup_subsys_state *css,
 	return 0;
 }
 
-#ifdef CONFIG_MEMCG_KMEM
-static int memcg_init_kmem(struct mem_cgroup *memcg)
-{
-	int ret;
-
-	ret = memcg_propagate_kmem(memcg);
-	if (ret)
-		return ret;
-
-	return tcp_init_cgroup(memcg);
-}
-
-static void memcg_offline_kmem(struct mem_cgroup *memcg)
-{
-	struct cgroup_subsys_state *css;
-	struct mem_cgroup *parent, *child;
-	int kmemcg_id;
-
-	if (memcg->kmem_state != KMEM_ONLINE)
-		return;
-	/*
-	 * Clear the online state before clearing memcg_caches array
-	 * entries. The slab_mutex in memcg_deactivate_kmem_caches()
-	 * guarantees that no cache will be created for this cgroup
-	 * after we are done (see memcg_create_kmem_cache()).
-	 */
-	memcg->kmem_state = KMEM_ALLOCATED;
-
-	memcg_deactivate_kmem_caches(memcg);
-
-	kmemcg_id = memcg->kmemcg_id;
-	BUG_ON(kmemcg_id < 0);
-
-	parent = parent_mem_cgroup(memcg);
-	if (!parent)
-		parent = root_mem_cgroup;
-
-	/*
-	 * Change kmemcg_id of this cgroup and all its descendants to the
-	 * parent's id, and then move all entries from this cgroup's list_lrus
-	 * to ones of the parent. After we have finished, all list_lrus
-	 * corresponding to this cgroup are guaranteed to remain empty. The
-	 * ordering is imposed by list_lru_node->lock taken by
-	 * memcg_drain_all_list_lrus().
-	 */
-	css_for_each_descendant_pre(css, &memcg->css) {
-		child = mem_cgroup_from_css(css);
-		BUG_ON(child->kmemcg_id != kmemcg_id);
-		child->kmemcg_id = parent->kmemcg_id;
-		if (!memcg->use_hierarchy)
-			break;
-	}
-	memcg_drain_all_list_lrus(kmemcg_id, parent->kmemcg_id);
-
-	memcg_free_cache_id(kmemcg_id);
-}
-
-static void memcg_free_kmem(struct mem_cgroup *memcg)
-{
-	if (memcg->kmem_state == KMEM_ALLOCATED) {
-		memcg_destroy_kmem_caches(memcg);
-		static_branch_dec(&memcg_kmem_enabled_key);
-		WARN_ON(page_counter_read(&memcg->kmem));
-	}
-	tcp_destroy_cgroup(memcg);
-}
-#else
-static int memcg_init_kmem(struct mem_cgroup *memcg, struct cgroup_subsys *ss)
-{
-	return 0;
-}
-
-static void memcg_offline_kmem(struct mem_cgroup *memcg)
-{
-}
-
-static void memcg_free_kmem(struct mem_cgroup *memcg)
-{
-}
-#endif
-
 #ifdef CONFIG_CGROUP_WRITEBACK
 
 struct list_head *mem_cgroup_cgwb_list(struct mem_cgroup *memcg)
-- 
2.6.3


^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 5/8] mm: memcontrol: separate kmem code from legacy tcp accounting code
  2015-12-08 18:34 ` Johannes Weiner
@ 2015-12-08 18:34   ` Johannes Weiner
  -1 siblings, 0 replies; 79+ messages in thread
From: Johannes Weiner @ 2015-12-08 18:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Michal Hocko, Vladimir Davydov, linux-mm, cgroups, linux-kernel,
	kernel-team

The cgroup2 memory controller will include important in-kernel memory
consumers by default, including socket memory, but it will no longer
carry the historic tcp control interface.

Separate the kmem state init from the tcp control interface init in
preparation for that.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/memcontrol.c | 33 ++++++++++++---------------------
 1 file changed, 12 insertions(+), 21 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 5118618..55a3f07 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2925,17 +2925,6 @@ static int memcg_propagate_kmem(struct mem_cgroup *memcg)
 	return ret;
 }
 
-static int memcg_init_kmem(struct mem_cgroup *memcg)
-{
-	int ret;
-
-	ret = memcg_propagate_kmem(memcg);
-	if (ret)
-		return ret;
-
-	return tcp_init_cgroup(memcg);
-}
-
 static void memcg_offline_kmem(struct mem_cgroup *memcg)
 {
 	struct cgroup_subsys_state *css;
@@ -2988,7 +2977,6 @@ static void memcg_free_kmem(struct mem_cgroup *memcg)
 		static_branch_dec(&memcg_kmem_enabled_key);
 		WARN_ON(page_counter_read(&memcg->kmem));
 	}
-	tcp_destroy_cgroup(memcg);
 }
 #else
 static int memcg_update_kmem_limit(struct mem_cgroup *memcg,
@@ -2996,16 +2984,9 @@ static int memcg_update_kmem_limit(struct mem_cgroup *memcg,
 {
 	return -EINVAL;
 }
-static int memcg_init_kmem(struct mem_cgroup *memcg)
-{
-	return 0;
-}
 static void memcg_offline_kmem(struct mem_cgroup *memcg)
 {
 }
-static void memcg_free_kmem(struct mem_cgroup *memcg)
-{
-}
 #endif /* CONFIG_MEMCG_KMEM */
 
 /*
@@ -4241,9 +4222,14 @@ mem_cgroup_css_online(struct cgroup_subsys_state *css)
 	}
 	mutex_unlock(&memcg_create_mutex);
 
-	ret = memcg_init_kmem(memcg);
+#ifdef CONFIG_MEMCG_KMEM
+	ret = memcg_propagate_kmem(memcg);
 	if (ret)
 		return ret;
+	ret = tcp_init_cgroup(memcg);
+	if (ret)
+		return ret;
+#endif
 
 #ifdef CONFIG_INET
 	if (cgroup_subsys_on_dfl(memory_cgrp_subsys) && !cgroup_memory_nosocket)
@@ -4288,11 +4274,16 @@ static void mem_cgroup_css_free(struct cgroup_subsys_state *css)
 {
 	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 
-	memcg_free_kmem(memcg);
 #ifdef CONFIG_INET
 	if (cgroup_subsys_on_dfl(memory_cgrp_subsys) && !cgroup_memory_nosocket)
 		static_branch_dec(&memcg_sockets_enabled_key);
 #endif
+
+#ifdef CONFIG_MEMCG_KMEM
+	memcg_free_kmem(memcg);
+	tcp_destroy_cgroup(memcg);
+#endif
+
 	__mem_cgroup_free(memcg);
 }
 
-- 
2.6.3


^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 6/8] mm: memcontrol: move kmem accounting code to CONFIG_MEMCG
  2015-12-08 18:34 ` Johannes Weiner
@ 2015-12-08 18:34   ` Johannes Weiner
  -1 siblings, 0 replies; 79+ messages in thread
From: Johannes Weiner @ 2015-12-08 18:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Michal Hocko, Vladimir Davydov, linux-mm, cgroups, linux-kernel,
	kernel-team

The cgroup2 memory controller will account important in-kernel memory
consumers by default. Move all necessary components to CONFIG_MEMCG.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
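The header comment moved by this patch explains why the charge helpers
are inline wrappers: the cheap checks run at the call site, and the
out-of-line function is only reached when accounting is enabled and the
allocation asked for it. The following is a rough userspace sketch of
that gating, not kernel code. A plain bool stands in for the static key,
MODEL_GFP_ACCOUNT stands in for __GFP_ACCOUNT, the task-context checks
of __memcg_kmem_bypass() are left out, and all model_* names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for __GFP_ACCOUNT; the value is arbitrary. */
#define MODEL_GFP_ACCOUNT	0x1u

/* A plain flag models memcg_kmem_enabled_key (a static key in the kernel). */
static bool model_kmem_enabled;

static unsigned long model_slow_calls;		/* times the slow path ran */
static unsigned long model_charged_pages;	/* pages charged so far */

/* Out-of-line slow path, reached only after the inline checks pass. */
static int model_charge_slow(unsigned int order)
{
	assert(order < 8 * sizeof(unsigned long));
	model_slow_calls++;
	model_charged_pages += 1UL << order;
	return 0;
}

/* Inline wrapper: both early returns avoid the function call entirely. */
static inline int model_kmem_charge(unsigned int gfp, unsigned int order)
{
	if (!model_kmem_enabled)
		return 0;
	if (!(gfp & MODEL_GFP_ACCOUNT))
		return 0;
	return model_charge_slow(order);
}
```

With the flag off, or without the accounting bit in gfp, the wrapper
returns 0 and the slow-path counter stays untouched. An order-2 request
with the bit set charges four pages in this model.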
 include/linux/list_lru.h   |   4 +-
 include/linux/memcontrol.h | 317 ++++++++++++++++++++++-----------------------
 include/linux/sched.h      |   2 -
 include/linux/slab.h       |   2 +-
 include/linux/slab_def.h   |   3 +-
 include/linux/slub_def.h   |   2 +-
 mm/list_lru.c              |  12 +-
 mm/memcontrol.c            |  54 ++++----
 mm/slab.h                  |   6 +-
 mm/slab_common.c           |  10 +-
 mm/slub.c                  |  10 +-
 11 files changed, 206 insertions(+), 216 deletions(-)

diff --git a/include/linux/list_lru.h b/include/linux/list_lru.h
index 2a6b994..3c66b96 100644
--- a/include/linux/list_lru.h
+++ b/include/linux/list_lru.h
@@ -40,7 +40,7 @@ struct list_lru_node {
 	spinlock_t		lock;
 	/* global list, used for the root cgroup in cgroup aware lrus */
 	struct list_lru_one	lru;
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG
 	/* for cgroup aware lrus points to per cgroup lists, otherwise NULL */
 	struct list_lru_memcg	*memcg_lrus;
 #endif
@@ -48,7 +48,7 @@ struct list_lru_node {
 
 struct list_lru {
 	struct list_lru_node	*node;
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG
 	struct list_head	list;
 #endif
 };
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 54dab4d..80f38da 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -236,11 +236,10 @@ struct mem_cgroup {
 #if defined(CONFIG_MEMCG_KMEM) && defined(CONFIG_INET)
 	struct cg_proto tcp_mem;
 #endif
-#if defined(CONFIG_MEMCG_KMEM)
+
         /* Index in the kmem_cache->memcg_params.memcg_caches array */
 	int kmemcg_id;
 	enum memcg_kmem_state kmem_state;
-#endif
 
 	int last_scanned_node;
 #if MAX_NUMNODES > 1
@@ -505,6 +504,117 @@ out:
 void mem_cgroup_split_huge_fixup(struct page *head);
 #endif
 
+extern struct static_key_false memcg_kmem_enabled_key;
+
+extern int memcg_nr_cache_ids;
+void memcg_get_cache_ids(void);
+void memcg_put_cache_ids(void);
+
+/*
+ * Helper macro to loop through all memcg-specific caches. Callers must still
+ * check if the cache is valid (it is either valid or NULL).
+ * the slab_mutex must be held when looping through those caches
+ */
+#define for_each_memcg_cache_index(_idx)	\
+	for ((_idx) = 0; (_idx) < memcg_nr_cache_ids; (_idx)++)
+
+static inline bool memcg_kmem_enabled(void)
+{
+	return static_branch_unlikely(&memcg_kmem_enabled_key);
+}
+
+static inline bool memcg_kmem_online(struct mem_cgroup *memcg)
+{
+	return memcg->kmem_state == KMEM_ONLINE;
+}
+
+/*
+ * In general, we'll do everything in our power to not incur in any overhead
+ * for non-memcg users for the kmem functions. Not even a function call, if we
+ * can avoid it.
+ *
+ * Therefore, we'll inline all those functions so that in the best case, we'll
+ * see that kmemcg is off for everybody and proceed quickly.  If it is on,
+ * we'll still do most of the flag checking inline. We check a lot of
+ * conditions, but because they are pretty simple, they are expected to be
+ * fast.
+ */
+int __memcg_kmem_charge_memcg(struct page *page, gfp_t gfp, int order,
+			      struct mem_cgroup *memcg);
+int __memcg_kmem_charge(struct page *page, gfp_t gfp, int order);
+void __memcg_kmem_uncharge(struct page *page, int order);
+
+/*
+ * helper for accessing a memcg's index. It will be used as an index in the
+ * child cache array in kmem_cache, and also to derive its name. This function
+ * will return -1 when this is not a kmem-limited memcg.
+ */
+static inline int memcg_cache_id(struct mem_cgroup *memcg)
+{
+	return memcg ? memcg->kmemcg_id : -1;
+}
+
+struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp);
+void __memcg_kmem_put_cache(struct kmem_cache *cachep);
+
+static inline bool __memcg_kmem_bypass(void)
+{
+	if (!memcg_kmem_enabled())
+		return true;
+	if (in_interrupt() || (!current->mm) || (current->flags & PF_KTHREAD))
+		return true;
+	return false;
+}
+
+/**
+ * memcg_kmem_charge: charge a kmem page
+ * @page: page to charge
+ * @gfp: reclaim mode
+ * @order: allocation order
+ *
+ * Returns 0 on success, an error code on failure.
+ */
+static __always_inline int memcg_kmem_charge(struct page *page,
+					     gfp_t gfp, int order)
+{
+	if (__memcg_kmem_bypass())
+		return 0;
+	if (!(gfp & __GFP_ACCOUNT))
+		return 0;
+	return __memcg_kmem_charge(page, gfp, order);
+}
+
+/**
+ * memcg_kmem_uncharge: uncharge a kmem page
+ * @page: page to uncharge
+ * @order: allocation order
+ */
+static __always_inline void memcg_kmem_uncharge(struct page *page, int order)
+{
+	if (memcg_kmem_enabled())
+		__memcg_kmem_uncharge(page, order);
+}
+
+/**
+ * memcg_kmem_get_cache: selects the correct per-memcg cache for allocation
+ * @cachep: the original global kmem cache
+ *
+ * All memory allocated from a per-memcg cache is charged to the owner memcg.
+ */
+static __always_inline struct kmem_cache *
+memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp)
+{
+	if (__memcg_kmem_bypass())
+		return cachep;
+	return __memcg_kmem_get_cache(cachep, gfp);
+}
+
+static __always_inline void memcg_kmem_put_cache(struct kmem_cache *cachep)
+{
+	if (memcg_kmem_enabled())
+		__memcg_kmem_put_cache(cachep);
+}
+
 #else /* CONFIG_MEMCG */
 struct mem_cgroup;
 
@@ -680,6 +790,52 @@ static inline
 void mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx)
 {
 }
+
+#define for_each_memcg_cache_index(_idx)	\
+	for (; NULL; )
+
+static inline bool memcg_kmem_enabled(void)
+{
+	return false;
+}
+
+static inline bool memcg_kmem_online(struct mem_cgroup *memcg)
+{
+	return false;
+}
+
+static inline int memcg_kmem_charge(struct page *page, gfp_t gfp, int order)
+{
+	return 0;
+}
+
+static inline void memcg_kmem_uncharge(struct page *page, int order)
+{
+}
+
+static inline int memcg_cache_id(struct mem_cgroup *memcg)
+{
+	return -1;
+}
+
+static inline void memcg_get_cache_ids(void)
+{
+}
+
+static inline void memcg_put_cache_ids(void)
+{
+}
+
+static inline struct kmem_cache *
+memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp)
+{
+	return cachep;
+}
+
+static inline void memcg_kmem_put_cache(struct kmem_cache *cachep)
+{
+}
+
 #endif /* CONFIG_MEMCG */
 
 #ifdef CONFIG_CGROUP_WRITEBACK
@@ -735,161 +891,4 @@ static inline bool mem_cgroup_under_socket_pressure(struct mem_cgroup *memcg)
 }
 #endif
 
-#ifdef CONFIG_MEMCG_KMEM
-extern struct static_key_false memcg_kmem_enabled_key;
-
-extern int memcg_nr_cache_ids;
-void memcg_get_cache_ids(void);
-void memcg_put_cache_ids(void);
-
-/*
- * Helper macro to loop through all memcg-specific caches. Callers must still
- * check if the cache is valid (it is either valid or NULL).
- * the slab_mutex must be held when looping through those caches
- */
-#define for_each_memcg_cache_index(_idx)	\
-	for ((_idx) = 0; (_idx) < memcg_nr_cache_ids; (_idx)++)
-
-static inline bool memcg_kmem_enabled(void)
-{
-	return static_branch_unlikely(&memcg_kmem_enabled_key);
-}
-
-static inline bool memcg_kmem_online(struct mem_cgroup *memcg)
-{
-	return memcg->kmem_state == KMEM_ONLINE;
-}
-
-/*
- * In general, we'll do everything in our power to not incur in any overhead
- * for non-memcg users for the kmem functions. Not even a function call, if we
- * can avoid it.
- *
- * Therefore, we'll inline all those functions so that in the best case, we'll
- * see that kmemcg is off for everybody and proceed quickly.  If it is on,
- * we'll still do most of the flag checking inline. We check a lot of
- * conditions, but because they are pretty simple, they are expected to be
- * fast.
- */
-int __memcg_kmem_charge_memcg(struct page *page, gfp_t gfp, int order,
-			      struct mem_cgroup *memcg);
-int __memcg_kmem_charge(struct page *page, gfp_t gfp, int order);
-void __memcg_kmem_uncharge(struct page *page, int order);
-
-/*
- * helper for acessing a memcg's index. It will be used as an index in the
- * child cache array in kmem_cache, and also to derive its name. This function
- * will return -1 when this is not a kmem-limited memcg.
- */
-static inline int memcg_cache_id(struct mem_cgroup *memcg)
-{
-	return memcg ? memcg->kmemcg_id : -1;
-}
-
-struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp);
-void __memcg_kmem_put_cache(struct kmem_cache *cachep);
-
-static inline bool __memcg_kmem_bypass(void)
-{
-	if (!memcg_kmem_enabled())
-		return true;
-	if (in_interrupt() || (!current->mm) || (current->flags & PF_KTHREAD))
-		return true;
-	return false;
-}
-
-/**
- * memcg_kmem_charge: charge a kmem page
- * @page: page to charge
- * @gfp: reclaim mode
- * @order: allocation order
- *
- * Returns 0 on success, an error code on failure.
- */
-static __always_inline int memcg_kmem_charge(struct page *page,
-					     gfp_t gfp, int order)
-{
-	if (__memcg_kmem_bypass())
-		return 0;
-	if (!(gfp & __GFP_ACCOUNT))
-		return 0;
-	return __memcg_kmem_charge(page, gfp, order);
-}
-
-/**
- * memcg_kmem_uncharge: uncharge a kmem page
- * @page: page to uncharge
- * @order: allocation order
- */
-static __always_inline void memcg_kmem_uncharge(struct page *page, int order)
-{
-	if (memcg_kmem_enabled())
-		__memcg_kmem_uncharge(page, order);
-}
-
-/**
- * memcg_kmem_get_cache: selects the correct per-memcg cache for allocation
- * @cachep: the original global kmem cache
- *
- * All memory allocated from a per-memcg cache is charged to the owner memcg.
- */
-static __always_inline struct kmem_cache *
-memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp)
-{
-	if (__memcg_kmem_bypass())
-		return cachep;
-	return __memcg_kmem_get_cache(cachep, gfp);
-}
-
-static __always_inline void memcg_kmem_put_cache(struct kmem_cache *cachep)
-{
-	if (memcg_kmem_enabled())
-		__memcg_kmem_put_cache(cachep);
-}
-#else
-#define for_each_memcg_cache_index(_idx)	\
-	for (; NULL; )
-
-static inline bool memcg_kmem_enabled(void)
-{
-	return false;
-}
-
-static inline bool memcg_kmem_online(struct mem_cgroup *memcg)
-{
-	return false;
-}
-
-static inline int memcg_kmem_charge(struct page *page, gfp_t gfp, int order)
-{
-	return 0;
-}
-
-static inline void memcg_kmem_uncharge(struct page *page, int order)
-{
-}
-
-static inline int memcg_cache_id(struct mem_cgroup *memcg)
-{
-	return -1;
-}
-
-static inline void memcg_get_cache_ids(void)
-{
-}
-
-static inline void memcg_put_cache_ids(void)
-{
-}
-
-static inline struct kmem_cache *
-memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp)
-{
-	return cachep;
-}
-
-static inline void memcg_kmem_put_cache(struct kmem_cache *cachep)
-{
-}
-#endif /* CONFIG_MEMCG_KMEM */
 #endif /* _LINUX_MEMCONTROL_H */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index edad7a4..62b5a6e 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1465,8 +1465,6 @@ struct task_struct {
 	unsigned sched_migrated:1;
 #ifdef CONFIG_MEMCG
 	unsigned memcg_may_oom:1;
-#endif
-#ifdef CONFIG_MEMCG_KMEM
 	unsigned memcg_kmem_skip_account:1;
 #endif
 #ifdef CONFIG_COMPAT_BRK
diff --git a/include/linux/slab.h b/include/linux/slab.h
index 3ffee74..b0a7034 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -86,7 +86,7 @@
 #else
 # define SLAB_FAILSLAB		0x00000000UL
 #endif
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG
 # define SLAB_ACCOUNT		0x04000000UL	/* Account to memcg */
 #else
 # define SLAB_ACCOUNT		0x00000000UL
diff --git a/include/linux/slab_def.h b/include/linux/slab_def.h
index 33d0490..cf139d3 100644
--- a/include/linux/slab_def.h
+++ b/include/linux/slab_def.h
@@ -69,7 +69,8 @@ struct kmem_cache {
 	 */
 	int obj_offset;
 #endif /* CONFIG_DEBUG_SLAB */
-#ifdef CONFIG_MEMCG_KMEM
+
+#ifdef CONFIG_MEMCG
 	struct memcg_cache_params memcg_params;
 #endif
 
diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index 3388511..b7e57927 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -84,7 +84,7 @@ struct kmem_cache {
 #ifdef CONFIG_SYSFS
 	struct kobject kobj;	/* For sysfs */
 #endif
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG
 	struct memcg_cache_params memcg_params;
 	int max_attr_size; /* for propagation, maximum size of a stored attr */
 #ifdef CONFIG_SYSFS
diff --git a/mm/list_lru.c b/mm/list_lru.c
index afc71ea..568267d 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -12,7 +12,7 @@
 #include <linux/mutex.h>
 #include <linux/memcontrol.h>
 
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG
 static LIST_HEAD(list_lrus);
 static DEFINE_MUTEX(list_lrus_mutex);
 
@@ -37,9 +37,9 @@ static void list_lru_register(struct list_lru *lru)
 static void list_lru_unregister(struct list_lru *lru)
 {
 }
-#endif /* CONFIG_MEMCG_KMEM */
+#endif /* CONFIG_MEMCG */
 
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG
 static inline bool list_lru_memcg_aware(struct list_lru *lru)
 {
 	/*
@@ -104,7 +104,7 @@ list_lru_from_kmem(struct list_lru_node *nlru, void *ptr)
 {
 	return &nlru->lru;
 }
-#endif /* CONFIG_MEMCG_KMEM */
+#endif /* CONFIG_MEMCG */
 
 bool list_lru_add(struct list_lru *lru, struct list_head *item)
 {
@@ -292,7 +292,7 @@ static void init_one_lru(struct list_lru_one *l)
 	l->nr_items = 0;
 }
 
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG
 static void __memcg_destroy_list_lru_node(struct list_lru_memcg *memcg_lrus,
 					  int begin, int end)
 {
@@ -529,7 +529,7 @@ static int memcg_init_list_lru(struct list_lru *lru, bool memcg_aware)
 static void memcg_destroy_list_lru(struct list_lru *lru)
 {
 }
-#endif /* CONFIG_MEMCG_KMEM */
+#endif /* CONFIG_MEMCG */
 
 int __list_lru_init(struct list_lru *lru, bool memcg_aware,
 		    struct lock_class_key *key)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 55a3f07..ab72c47 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -297,7 +297,6 @@ static inline struct mem_cgroup *mem_cgroup_from_id(unsigned short id)
 	return mem_cgroup_from_css(css);
 }
 
-#ifdef CONFIG_MEMCG_KMEM
 /*
  * This will be the memcg's index in each cache's ->memcg_params.memcg_caches.
  * The main reason for not using cgroup id for this:
@@ -349,8 +348,6 @@ void memcg_put_cache_ids(void)
 DEFINE_STATIC_KEY_FALSE(memcg_kmem_enabled_key);
 EXPORT_SYMBOL(memcg_kmem_enabled_key);
 
-#endif /* CONFIG_MEMCG_KMEM */
-
 static struct mem_cgroup_per_zone *
 mem_cgroup_zone_zoneinfo(struct mem_cgroup *memcg, struct zone *zone)
 {
@@ -2182,7 +2179,6 @@ static void commit_charge(struct page *page, struct mem_cgroup *memcg,
 		unlock_page_lru(page, isolated);
 }
 
-#ifdef CONFIG_MEMCG_KMEM
 static int memcg_alloc_cache_id(void)
 {
 	int id, size;
@@ -2403,7 +2399,6 @@ void __memcg_kmem_uncharge(struct page *page, int order)
 	page->mem_cgroup = NULL;
 	css_put_many(&memcg->css, nr_pages);
 }
-#endif /* CONFIG_MEMCG_KMEM */
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 
@@ -2839,7 +2834,6 @@ static u64 mem_cgroup_read_u64(struct cgroup_subsys_state *css,
 	}
 }
 
-#ifdef CONFIG_MEMCG_KMEM
 static int memcg_online_kmem(struct mem_cgroup *memcg)
 {
 	int err = 0;
@@ -2887,24 +2881,6 @@ out:
 	return err;
 }
 
-static int memcg_update_kmem_limit(struct mem_cgroup *memcg,
-				   unsigned long limit)
-{
-	int ret;
-
-	mutex_lock(&memcg_limit_mutex);
-	/* Top-level cgroup doesn't propagate from root */
-	if (!memcg_kmem_online(memcg)) {
-		ret = memcg_online_kmem(memcg);
-		if (ret)
-			goto out;
-	}
-	ret = page_counter_limit(&memcg->kmem, limit);
-out:
-	mutex_unlock(&memcg_limit_mutex);
-	return ret;
-}
-
 static int memcg_propagate_kmem(struct mem_cgroup *memcg)
 {
 	int ret = 0;
@@ -2978,14 +2954,30 @@ static void memcg_free_kmem(struct mem_cgroup *memcg)
 		WARN_ON(page_counter_read(&memcg->kmem));
 	}
 }
-#else
+
+#ifdef CONFIG_MEMCG_KMEM
 static int memcg_update_kmem_limit(struct mem_cgroup *memcg,
 				   unsigned long limit)
 {
-	return -EINVAL;
+	int ret;
+
+	mutex_lock(&memcg_limit_mutex);
+	/* Top-level cgroup doesn't propagate from root */
+	if (!memcg_kmem_online(memcg)) {
+		ret = memcg_online_kmem(memcg);
+		if (ret)
+			goto out;
+	}
+	ret = page_counter_limit(&memcg->kmem, limit);
+out:
+	mutex_unlock(&memcg_limit_mutex);
+	return ret;
 }
-static void memcg_offline_kmem(struct mem_cgroup *memcg)
+#else
+static int memcg_update_kmem_limit(struct mem_cgroup *memcg,
+				   unsigned long limit)
 {
+	return -EINVAL;
 }
 #endif /* CONFIG_MEMCG_KMEM */
 
@@ -4160,9 +4152,7 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 	vmpressure_init(&memcg->vmpressure);
 	INIT_LIST_HEAD(&memcg->event_list);
 	spin_lock_init(&memcg->event_list_lock);
-#ifdef CONFIG_MEMCG_KMEM
 	memcg->kmemcg_id = -1;
-#endif
 #ifdef CONFIG_CGROUP_WRITEBACK
 	INIT_LIST_HEAD(&memcg->cgwb_list);
 #endif
@@ -4222,10 +4212,11 @@ mem_cgroup_css_online(struct cgroup_subsys_state *css)
 	}
 	mutex_unlock(&memcg_create_mutex);
 
-#ifdef CONFIG_MEMCG_KMEM
 	ret = memcg_propagate_kmem(memcg);
 	if (ret)
 		return ret;
+
+#ifdef CONFIG_MEMCG_KMEM
 	ret = tcp_init_cgroup(memcg);
 	if (ret)
 		return ret;
@@ -4279,8 +4270,9 @@ static void mem_cgroup_css_free(struct cgroup_subsys_state *css)
 		static_branch_dec(&memcg_sockets_enabled_key);
 #endif
 
-#ifdef CONFIG_MEMCG_KMEM
 	memcg_free_kmem(memcg);
+
+#ifdef CONFIG_MEMCG_KMEM
 	tcp_destroy_cgroup(memcg);
 #endif
 
diff --git a/mm/slab.h b/mm/slab.h
index c63b869..5adec08 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -173,7 +173,7 @@ ssize_t slabinfo_write(struct file *file, const char __user *buffer,
 void __kmem_cache_free_bulk(struct kmem_cache *, size_t, void **);
 int __kmem_cache_alloc_bulk(struct kmem_cache *, gfp_t, size_t, void **);
 
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG
 /*
  * Iterate over all memcg caches of the given root cache. The caller must hold
  * slab_mutex.
@@ -251,7 +251,7 @@ static __always_inline int memcg_charge_slab(struct page *page,
 
 extern void slab_init_memcg_params(struct kmem_cache *);
 
-#else /* !CONFIG_MEMCG_KMEM */
+#else /* !CONFIG_MEMCG */
 
 #define for_each_memcg_cache(iter, root) \
 	for ((void)(iter), (void)(root); 0; )
@@ -292,7 +292,7 @@ static inline int memcg_charge_slab(struct page *page, gfp_t gfp, int order,
 static inline void slab_init_memcg_params(struct kmem_cache *s)
 {
 }
-#endif /* CONFIG_MEMCG_KMEM */
+#endif /* CONFIG_MEMCG */
 
 static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x)
 {
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 8c262e6..34103b8 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -128,7 +128,7 @@ int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t nr,
 	return i;
 }
 
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG
 void slab_init_memcg_params(struct kmem_cache *s)
 {
 	s->memcg_params.is_root_cache = true;
@@ -221,7 +221,7 @@ static inline int init_memcg_params(struct kmem_cache *s,
 static inline void destroy_memcg_params(struct kmem_cache *s)
 {
 }
-#endif /* CONFIG_MEMCG_KMEM */
+#endif /* CONFIG_MEMCG */
 
 /*
  * Find a mergeable slab cache
@@ -477,7 +477,7 @@ static void release_caches(struct list_head *release, bool need_rcu_barrier)
 	}
 }
 
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG
 /*
  * memcg_create_kmem_cache - Create a cache for a memory cgroup.
  * @memcg: The memory cgroup the new cache is for.
@@ -689,7 +689,7 @@ static inline int shutdown_memcg_caches(struct kmem_cache *s,
 {
 	return 0;
 }
-#endif /* CONFIG_MEMCG_KMEM */
+#endif /* CONFIG_MEMCG */
 
 void slab_kmem_cache_release(struct kmem_cache *s)
 {
@@ -1123,7 +1123,7 @@ static int slab_show(struct seq_file *m, void *p)
 	return 0;
 }
 
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG
 int memcg_slab_show(struct seq_file *m, void *p)
 {
 	struct kmem_cache *s = list_entry(p, struct kmem_cache, list);
diff --git a/mm/slub.c b/mm/slub.c
index b21fd24..2e1355a 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -5207,7 +5207,7 @@ static ssize_t slab_attr_store(struct kobject *kobj,
 		return -EIO;
 
 	err = attribute->store(s, buf, len);
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG
 	if (slab_state >= FULL && err >= 0 && is_root_cache(s)) {
 		struct kmem_cache *c;
 
@@ -5242,7 +5242,7 @@ static ssize_t slab_attr_store(struct kobject *kobj,
 
 static void memcg_propagate_slab_attrs(struct kmem_cache *s)
 {
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG
 	int i;
 	char *buffer = NULL;
 	struct kmem_cache *root_cache;
@@ -5328,7 +5328,7 @@ static struct kset *slab_kset;
 
 static inline struct kset *cache_kset(struct kmem_cache *s)
 {
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG
 	if (!is_root_cache(s))
 		return s->memcg_params.root_cache->memcg_kset;
 #endif
@@ -5405,7 +5405,7 @@ static int sysfs_slab_add(struct kmem_cache *s)
 	if (err)
 		goto out_del_kobj;
 
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG
 	if (is_root_cache(s)) {
 		s->memcg_kset = kset_create_and_add("cgroup", NULL, &s->kobj);
 		if (!s->memcg_kset) {
@@ -5438,7 +5438,7 @@ void sysfs_slab_remove(struct kmem_cache *s)
 		 */
 		return;
 
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG
 	kset_unregister(s->memcg_kset);
 #endif
 	kobject_uevent(&s->kobj, KOBJ_REMOVE);
-- 
2.6.3
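
Editorial sketch, not part of the patch: the diff above keeps callers free
of #ifdefs by giving every helper a real definition when the option is
enabled and a no-op stub otherwise (see the "#else /* !CONFIG_MEMCG */"
branch in mm/slab.h). A minimal userspace illustration of that pattern
follows. DEMO_MEMCG and all demo_* names are hypothetical stand-ins chosen
for this example, not kernel symbols.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <assert.h>

/* DEMO_MEMCG stands in for CONFIG_MEMCG; build with -DDEMO_MEMCG to enable. */
struct demo_cache {
	const char *name;
#ifdef DEMO_MEMCG
	int memcg_id;	/* field exists only when the feature is compiled in */
#endif
};

#ifdef DEMO_MEMCG
/* Real helpers: consult per-cache state. */
static inline bool demo_accounting_built_in(void)
{
	return true;
}

static inline int demo_cache_id(const struct demo_cache *c)
{
	return c ? c->memcg_id : -1;
}
#else
/* Stubs: same signatures, constant results, so callers compile unchanged. */
static inline bool demo_accounting_built_in(void)
{
	return false;
}

static inline int demo_cache_id(const struct demo_cache *c)
{
	(void)c;
	return -1;
}
#endif

/* A caller that needs no #ifdef of its own. */
static int demo_describe(const struct demo_cache *c, char *buf, size_t len)
{
	assert(c && buf);
	return snprintf(buf, len, "%s: id=%d accounted=%d", c->name,
			demo_cache_id(c), demo_accounting_built_in());
}
```

Because the stubs return constants, a compiler can usually drop the dead
branches in callers when the option is off, which is presumably why the
patch moves the existing stubs rather than adding runtime checks.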


* [PATCH 7/8] mm: memcontrol: account "kmem" consumers in cgroup2 memory controller
  2015-12-08 18:34 ` Johannes Weiner
@ 2015-12-08 18:34   ` Johannes Weiner
  -1 siblings, 0 replies; 79+ messages in thread
From: Johannes Weiner @ 2015-12-08 18:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Michal Hocko, Vladimir Davydov, linux-mm, cgroups, linux-kernel,
	kernel-team

The original cgroup memory controller has an extension to account slab
memory (and other "kernel memory" consumers) in a separate "kmem"
counter, once the user sets an explicit limit on that "kmem" pool.

However, this includes various consumers whose sizes are directly
linked to userspace activity. Accounting them as an optional "kmem"
extension is problematic for several reasons:

1. It leaves the main memory interface with incomplete semantics. A
   user who puts their workload into a cgroup and configures a memory
   limit does not expect us to leave holes in the containment as big
   as the dentry and inode cache, or the kernel stack pages.

2. If the limit set on this random historical subgroup of consumers is
   reached, subsequent allocations will fail even when the main memory
   pool available to the cgroup is not yet exhausted and/or has
   reclaimable memory in it.

3. Calling it 'kernel memory' is misleading. The dentry and inode
   caches are no more 'kernel' (or no less 'user') memory than the
   page cache itself. Treating these consumers as different classes is
   a historical implementation detail that should not leak to users.

So, in addition to page cache, anonymous memory, and network socket
memory, account the following memory consumers by default in the
cgroup2 memory controller:

     - threadinfo
     - task_struct
     - task_delay_info
     - pid
     - cred
     - mm_struct
     - vm_area_struct and vm_region (nommu)
     - anon_vma and anon_vma_chain
     - signal_struct
     - sighand_struct
     - fs_struct
     - files_struct
     - fdtable and fdtable->full_fds_bits
     - dentry and external_name
     - inode for all filesystems.

This should give us reasonable memory isolation for most common
workloads out of the box.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/memcontrol.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index ab72c47..d048137 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2356,13 +2356,14 @@ int __memcg_kmem_charge_memcg(struct page *page, gfp_t gfp, int order,
 	if (!memcg_kmem_online(memcg))
 		return 0;
 
-	if (!page_counter_try_charge(&memcg->kmem, nr_pages, &counter))
-		return -ENOMEM;
-
 	ret = try_charge(memcg, gfp, nr_pages);
-	if (ret) {
-		page_counter_uncharge(&memcg->kmem, nr_pages);
+	if (ret)
 		return ret;
+
+	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) &&
+	    !page_counter_try_charge(&memcg->kmem, nr_pages, &counter)) {
+		cancel_charge(memcg, nr_pages);
+		return -ENOMEM;
 	}
 
 	page->mem_cgroup = memcg;
@@ -2391,7 +2392,9 @@ void __memcg_kmem_uncharge(struct page *page, int order)
 
 	VM_BUG_ON_PAGE(mem_cgroup_is_root(memcg), page);
 
-	page_counter_uncharge(&memcg->kmem, nr_pages);
+	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
+		page_counter_uncharge(&memcg->kmem, nr_pages);
+
 	page_counter_uncharge(&memcg->memory, nr_pages);
 	if (do_memsw_account())
 		page_counter_uncharge(&memcg->memsw, nr_pages);
@@ -2895,7 +2898,8 @@ static int memcg_propagate_kmem(struct mem_cgroup *memcg)
 	 * onlined after this point, because it has at least one child
 	 * already.
 	 */
-	if (memcg_kmem_online(parent))
+	if (cgroup_subsys_on_dfl(memory_cgrp_subsys) ||
+	    memcg_kmem_online(parent))
 		ret = memcg_online_kmem(memcg);
 	mutex_unlock(&memcg_limit_mutex);
 	return ret;
-- 
2.6.3
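
The charge ordering introduced by the hunks above can be sketched as a
small userspace model. This is an illustration under stated
assumptions, not the kernel code: struct counter, struct group, the
on_dfl flag and the kmem_charge()/kmem_uncharge() helpers are
hypothetical stand-ins for struct page_counter, struct mem_cgroup,
cgroup_subsys_on_dfl(memory_cgrp_subsys) and the __memcg_kmem_*
functions, and the model leaves out reclaim and the memsw counter.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's struct page_counter. */
struct counter {
	unsigned long usage;
	unsigned long limit;
};

/* Hypothetical stand-in for struct mem_cgroup. */
struct group {
	struct counter memory;	/* main pool, always charged */
	struct counter kmem;	/* separate pool, legacy hierarchy only */
	bool on_dfl;		/* models cgroup_subsys_on_dfl() */
};

/* Charge nr pages if they fit under the limit; true on success. */
static bool counter_try_charge(struct counter *c, unsigned long nr)
{
	if (nr > c->limit - c->usage)
		return false;
	c->usage += nr;
	return true;
}

static void counter_uncharge(struct counter *c, unsigned long nr)
{
	assert(c->usage >= nr);
	c->usage -= nr;
}

/* Main pool first; the kmem pool only on the legacy hierarchy. */
static int kmem_charge(struct group *g, unsigned long nr)
{
	if (!counter_try_charge(&g->memory, nr))
		return -ENOMEM;

	if (!g->on_dfl && !counter_try_charge(&g->kmem, nr)) {
		/* models cancel_charge(): undo the main pool charge */
		counter_uncharge(&g->memory, nr);
		return -ENOMEM;
	}
	return 0;
}

/* Mirror image of kmem_charge(). */
static void kmem_uncharge(struct group *g, unsigned long nr)
{
	if (!g->on_dfl)
		counter_uncharge(&g->kmem, nr);
	counter_uncharge(&g->memory, nr);
}
```

In this model a group with on_dfl set never touches the kmem counter,
so a small kmem limit cannot fail an allocation that still fits in
the main pool, which is the failure mode described in point 2 of the
changelog.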


^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 8/8] mm: memcontrol: introduce CONFIG_MEMCG_LEGACY_KMEM
  2015-12-08 18:34 ` Johannes Weiner
@ 2015-12-08 18:34   ` Johannes Weiner
  -1 siblings, 0 replies; 79+ messages in thread
From: Johannes Weiner @ 2015-12-08 18:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Michal Hocko, Vladimir Davydov, linux-mm, cgroups, linux-kernel,
	kernel-team

Let the user know that CONFIG_MEMCG_KMEM does not apply to the cgroup2
interface. This also makes legacy-only code sections stand out better.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 include/linux/memcontrol.h |  4 ++--
 init/Kconfig               | 10 +++++++++-
 mm/memcontrol.c            | 16 ++++++++--------
 net/ipv4/Makefile          |  2 +-
 4 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 80f38da..c6a5ed2 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -233,7 +233,7 @@ struct mem_cgroup {
 	 */
 	struct mem_cgroup_stat_cpu __percpu *stat;
 
-#if defined(CONFIG_MEMCG_KMEM) && defined(CONFIG_INET)
+#if defined(CONFIG_MEMCG_LEGACY_KMEM) && defined(CONFIG_INET)
 	struct cg_proto tcp_mem;
 #endif
 
@@ -873,7 +873,7 @@ extern struct static_key_false memcg_sockets_enabled_key;
 #define mem_cgroup_sockets_enabled static_branch_unlikely(&memcg_sockets_enabled_key)
 static inline bool mem_cgroup_under_socket_pressure(struct mem_cgroup *memcg)
 {
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG_LEGACY_KMEM
 	if (memcg->tcp_mem.memory_pressure)
 		return true;
 #endif
diff --git a/init/Kconfig b/init/Kconfig
index f1af42d..e5e4971 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1040,10 +1040,13 @@ config MEMCG_SWAP_ENABLED
 	  For those who want to have the feature enabled by default should
 	  select this option (if, for some reason, they need to disable it
 	  then swapaccount=0 does the trick).
+config MEMCG_LEGACY_KMEM
+       bool
 config MEMCG_KMEM
-	bool "Memory Resource Controller Kernel Memory accounting"
+	bool "Legacy Memory Resource Controller Kernel Memory accounting"
 	depends on MEMCG
 	depends on SLUB || SLAB
+	select MEMCG_LEGACY_KMEM
 	help
 	  The Kernel Memory extension for Memory Resource Controller can limit
 	  the amount of memory used by kernel objects in the system. Those are
@@ -1052,6 +1055,11 @@ config MEMCG_KMEM
 	  the kmem extension can use it to guarantee that no group of processes
 	  will ever exhaust kernel resources alone.
 
+	  This option affects the ORIGINAL cgroup interface. The cgroup2 memory
+	  controller includes important in-kernel memory consumers per default.
+
+	  If you're using cgroup2, say N.
+
 config CGROUP_HUGETLB
 	bool "HugeTLB Resource Controller for Control Groups"
 	depends on HUGETLB_PAGE
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index d048137..c527767 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2959,7 +2959,7 @@ static void memcg_free_kmem(struct mem_cgroup *memcg)
 	}
 }
 
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG_LEGACY_KMEM
 static int memcg_update_kmem_limit(struct mem_cgroup *memcg,
 				   unsigned long limit)
 {
@@ -2983,7 +2983,7 @@ static int memcg_update_kmem_limit(struct mem_cgroup *memcg,
 {
 	return -EINVAL;
 }
-#endif /* CONFIG_MEMCG_KMEM */
+#endif /* CONFIG_MEMCG_LEGACY_KMEM */
 
 /*
  * The user of this function is...
@@ -3995,7 +3995,7 @@ static struct cftype mem_cgroup_legacy_files[] = {
 		.seq_show = memcg_numa_stat_show,
 	},
 #endif
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG_LEGACY_KMEM
 	{
 		.name = "kmem.limit_in_bytes",
 		.private = MEMFILE_PRIVATE(_KMEM, RES_LIMIT),
@@ -4220,7 +4220,7 @@ mem_cgroup_css_online(struct cgroup_subsys_state *css)
 	if (ret)
 		return ret;
 
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG_LEGACY_KMEM
 	ret = tcp_init_cgroup(memcg);
 	if (ret)
 		return ret;
@@ -4276,7 +4276,7 @@ static void mem_cgroup_css_free(struct cgroup_subsys_state *css)
 
 	memcg_free_kmem(memcg);
 
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG_LEGACY_KMEM
 	tcp_destroy_cgroup(memcg);
 #endif
 
@@ -5495,7 +5495,7 @@ void sock_update_memcg(struct sock *sk)
 	memcg = mem_cgroup_from_task(current);
 	if (memcg == root_mem_cgroup)
 		goto out;
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG_LEGACY_KMEM
 	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) && !memcg->tcp_mem.active)
 		goto out;
 #endif
@@ -5524,7 +5524,7 @@ bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages)
 {
 	gfp_t gfp_mask = GFP_KERNEL;
 
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG_LEGACY_KMEM
 	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys)) {
 		struct page_counter *counter;
 
@@ -5556,7 +5556,7 @@ bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages)
  */
 void mem_cgroup_uncharge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages)
 {
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG_LEGACY_KMEM
 	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys)) {
 		page_counter_uncharge(&memcg->tcp_mem.memory_allocated,
 				      nr_pages);
diff --git a/net/ipv4/Makefile b/net/ipv4/Makefile
index c29809f..bee5055 100644
--- a/net/ipv4/Makefile
+++ b/net/ipv4/Makefile
@@ -56,7 +56,7 @@ obj-$(CONFIG_TCP_CONG_SCALABLE) += tcp_scalable.o
 obj-$(CONFIG_TCP_CONG_LP) += tcp_lp.o
 obj-$(CONFIG_TCP_CONG_YEAH) += tcp_yeah.o
 obj-$(CONFIG_TCP_CONG_ILLINOIS) += tcp_illinois.o
-obj-$(CONFIG_MEMCG_KMEM) += tcp_memcontrol.o
+obj-$(CONFIG_MEMCG_LEGACY_KMEM) += tcp_memcontrol.o
 obj-$(CONFIG_NETLABEL) += cipso_ipv4.o
 
 obj-$(CONFIG_XFRM) += xfrm4_policy.o xfrm4_state.o xfrm4_input.o \
-- 
2.6.3
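
The #ifdef pattern this patch renames can be sketched in isolation as
follows. It is a simplified userspace illustration, not the kernel
source: struct legacy_group and update_kmem_limit() are hypothetical
names modelled on the memcg_update_kmem_limit() stub visible in the
diff, and the macro is defined by hand here instead of being generated
from Kconfig.

```c
#include <errno.h>

/*
 * Uncomment to model a .config with CONFIG_MEMCG_KMEM=y, which (per
 * the Kconfig hunk) selects the hidden MEMCG_LEGACY_KMEM symbol.
 */
/* #define CONFIG_MEMCG_LEGACY_KMEM 1 */

/* Hypothetical stand-in for the legacy part of struct mem_cgroup. */
struct legacy_group {
	unsigned long kmem_limit;
};

#ifdef CONFIG_MEMCG_LEGACY_KMEM
/* Legacy interface compiled in: the limit can be updated. */
static int update_kmem_limit(struct legacy_group *g, unsigned long limit)
{
	g->kmem_limit = limit;
	return 0;
}
#else
/* Legacy interface compiled out: the stub rejects the request. */
static int update_kmem_limit(struct legacy_group *g, unsigned long limit)
{
	(void)g;
	(void)limit;
	return -EINVAL;
}
#endif /* CONFIG_MEMCG_LEGACY_KMEM */
```

With the macro left undefined, as it would be for a cgroup2-only
configuration that says N, callers get -EINVAL and the legacy limit
is never modified.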


^ permalink raw reply related	[flat|nested] 79+ messages in thread

* Re: [PATCH 1/8] mm: memcontrol: drop unused @css argument in memcg_init_kmem
  2015-12-08 18:34   ` Johannes Weiner
  (?)
@ 2015-12-09  9:01     ` Vladimir Davydov
  -1 siblings, 0 replies; 79+ messages in thread
From: Vladimir Davydov @ 2015-12-09  9:01 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, Michal Hocko, linux-mm, cgroups, linux-kernel,
	kernel-team

On Tue, Dec 08, 2015 at 01:34:18PM -0500, Johannes Weiner wrote:
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 2/8] mm: memcontrol: remove double kmem page_counter init
  2015-12-08 18:34   ` Johannes Weiner
@ 2015-12-09  9:05     ` Vladimir Davydov
  -1 siblings, 0 replies; 79+ messages in thread
From: Vladimir Davydov @ 2015-12-09  9:05 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, Michal Hocko, linux-mm, cgroups, linux-kernel,
	kernel-team

On Tue, Dec 08, 2015 at 01:34:19PM -0500, Johannes Weiner wrote:
> The kmem page_counter's limit is initialized to PAGE_COUNTER_MAX
> inside mem_cgroup_css_online(). There is no need to repeat this
> from memcg_propagate_kmem().
> 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 3/8] mm: memcontrol: give the kmem states more descriptive names
  2015-12-08 18:34   ` Johannes Weiner
  (?)
@ 2015-12-09  9:10     ` Vladimir Davydov
  -1 siblings, 0 replies; 79+ messages in thread
From: Vladimir Davydov @ 2015-12-09  9:10 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, Michal Hocko, linux-mm, cgroups, linux-kernel,
	kernel-team

On Tue, Dec 08, 2015 at 01:34:20PM -0500, Johannes Weiner wrote:
> On any given memcg, the kmem accounting feature has three separate
> states: not initialized, structures allocated, and actively accounting
> slab memory. These are represented through a combination of the
> kmem_acct_activated and kmem_acct_active flags, which is confusing.
> 
> Convert to a kmem_state enum with the states NONE, ALLOCATED, and
> ONLINE. Then rename the functions to modify the state accordingly.
> This follows the nomenclature of css object states more closely.
> 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>
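
A minimal sketch of the three-state enum described in the quoted
changelog, for readers who do not have the patch at hand. The
identifiers below (enum kmem_state, the KMEM_ prefix, and both
helpers) are assumptions made for illustration, as is the mapping
from the old flag pair, where "activated" is taken to mean that the
structures were allocated and "active" that accounting is currently
running.

```c
#include <stdbool.h>

/* The three states named in the changelog. */
enum kmem_state {
	KMEM_NONE,	/* not initialized */
	KMEM_ALLOCATED,	/* structures allocated */
	KMEM_ONLINE,	/* actively accounting slab memory */
};

/*
 * Hypothetical helper: translate the old kmem_acct_activated and
 * kmem_acct_active flag pair into a single state.
 */
static enum kmem_state kmem_state_from_flags(bool activated, bool active)
{
	if (!activated)
		return KMEM_NONE;
	return active ? KMEM_ONLINE : KMEM_ALLOCATED;
}

/* Hypothetical predicate in the spirit of memcg_kmem_online(). */
static bool kmem_online(enum kmem_state state)
{
	return state == KMEM_ONLINE;
}
```

A single enum makes the invalid combination (active but never
activated) unrepresentable, which two independent flags cannot do.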

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 4/8] mm: memcontrol: group kmem init and exit functions together
  2015-12-08 18:34   ` Johannes Weiner
  (?)
@ 2015-12-09  9:14     ` Vladimir Davydov
  -1 siblings, 0 replies; 79+ messages in thread
From: Vladimir Davydov @ 2015-12-09  9:14 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, Michal Hocko, linux-mm, cgroups, linux-kernel,
	kernel-team

On Tue, Dec 08, 2015 at 01:34:21PM -0500, Johannes Weiner wrote:
> Put all the related code to setup and teardown the kmem accounting
> state into the same location. No functional change intended.
> 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 5/8] mm: memcontrol: separate kmem code from legacy tcp accounting code
  2015-12-08 18:34   ` Johannes Weiner
  (?)
@ 2015-12-09  9:23     ` Vladimir Davydov
  -1 siblings, 0 replies; 79+ messages in thread
From: Vladimir Davydov @ 2015-12-09  9:23 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, Michal Hocko, linux-mm, cgroups, linux-kernel,
	kernel-team

On Tue, Dec 08, 2015 at 01:34:22PM -0500, Johannes Weiner wrote:
> The cgroup2 memory controller will include important in-kernel memory
> consumers per default, including socket memory, but it will no longer
> carry the historic tcp control interface.
> 
> Separate the kmem state init from the tcp control interface init in
> preparation for that.
> 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 6/8] mm: memcontrol: move kmem accounting code to CONFIG_MEMCG
  2015-12-08 18:34   ` Johannes Weiner
  (?)
@ 2015-12-09  9:32     ` Vladimir Davydov
  -1 siblings, 0 replies; 79+ messages in thread
From: Vladimir Davydov @ 2015-12-09  9:32 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, Michal Hocko, linux-mm, cgroups, linux-kernel,
	kernel-team

On Tue, Dec 08, 2015 at 01:34:23PM -0500, Johannes Weiner wrote:
> The cgroup2 memory controller will account important in-kernel memory
> consumers per default. Move all necessary components to CONFIG_MEMCG.
> 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 7/8] mm: memcontrol: account "kmem" consumers in cgroup2 memory controller
  2015-12-08 18:34   ` Johannes Weiner
  (?)
@ 2015-12-09 11:30     ` Vladimir Davydov
  -1 siblings, 0 replies; 79+ messages in thread
From: Vladimir Davydov @ 2015-12-09 11:30 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, Michal Hocko, linux-mm, cgroups, linux-kernel,
	kernel-team

On Tue, Dec 08, 2015 at 01:34:24PM -0500, Johannes Weiner wrote:
> The original cgroup memory controller has an extension to account slab
> memory (and other "kernel memory" consumers) in a separate "kmem"
> counter, once the user set an explicit limit on that "kmem" pool.
> 
> However, this includes various consumers whose sizes are directly
> linked to userspace activity. Accounting them as an optional "kmem"
> extension is problematic for several reasons:
> 
> 1. It leaves the main memory interface with incomplete semantics. A
>    user who puts their workload into a cgroup and configures a memory
>    limit does not expect us to leave holes in the containment as big
>    as the dentry and inode cache, or the kernel stack pages.
> 
> 2. If the limit set on this random historical subgroup of consumers is
>    reached, subsequent allocations will fail even when the main memory
>    pool available to the cgroup is not yet exhausted and/or has
>    reclaimable memory in it.
> 
> 3. Calling it 'kernel memory' is misleading. The dentry and inode
>    caches are no more 'kernel' (or no less 'user') memory than the
>    page cache itself. Treating these consumers as different classes is
>    a historical implementation detail that should not leak to users.
> 
> So, in addition to page cache, anonymous memory, and network socket
> memory, account the following memory consumers per default in the
> cgroup2 memory controller:
> 
>      - threadinfo
>      - task_struct
>      - task_delay_info
>      - pid
>      - cred
>      - mm_struct
>      - vm_area_struct and vm_region (nommu)
>      - anon_vma and anon_vma_chain
>      - signal_struct
>      - sighand_struct
>      - fs_struct
>      - files_struct
>      - fdtable and fdtable->full_fds_bits
>      - dentry and external_name
>      - inode for all filesystems.
> 
> This should give us reasonable memory isolation for most common
> workloads out of the box.
> 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>

The patch looks good to me, but I think we still need to add a boot-time
knob to disable kmem accounting, as we do for sockets:

From: Vladimir Davydov <vdavydov@virtuozzo.com>
Subject: [PATCH] mm: memcontrol: allow to disable kmem accounting for cgroup2

Kmem accounting might incur overhead that some users can't put up with.
Besides, the implementation is still considered unstable. So let's
provide a way to disable it for those users who aren't happy with it.

To disable kmem accounting for cgroup2, pass cgroup.memory=nokmem at
boot time.

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index c1bda3bbb7db..1b7a85dc6013 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -602,6 +602,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 	cgroup.memory=	[KNL] Pass options to the cgroup memory controller.
 			Format: <string>
 			nosocket -- Disable socket memory accounting.
+			nokmem -- Disable kernel memory accounting.
 
 	checkreqprot	[SELINUX] Set initial checkreqprot flag value.
 			Format: { "0" | "1" }
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 6faea81e66d7..6a5572241dc6 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -83,6 +83,9 @@ struct mem_cgroup *root_mem_cgroup __read_mostly;
 /* Socket memory accounting disabled? */
 static bool cgroup_memory_nosocket;
 
+/* Kernel memory accounting disabled? */
+static bool cgroup_memory_nokmem;
+
 /* Whether the swap controller is active */
 #ifdef CONFIG_MEMCG_SWAP
 int do_swap_account __read_mostly;
@@ -2898,8 +2901,8 @@ static int memcg_propagate_kmem(struct mem_cgroup *memcg)
 	 * onlined after this point, because it has at least one child
 	 * already.
 	 */
-	if (cgroup_subsys_on_dfl(memory_cgrp_subsys) ||
-	    memcg_kmem_online(parent))
+	if (memcg_kmem_online(parent) ||
+	    (cgroup_subsys_on_dfl(memory_cgrp_subsys) && !cgroup_memory_nokmem))
 		ret = memcg_online_kmem(memcg);
 	mutex_unlock(&memcg_limit_mutex);
 	return ret;
@@ -5587,6 +5590,8 @@ static int __init cgroup_memory(char *s)
 			continue;
 		if (!strcmp(token, "nosocket"))
 			cgroup_memory_nosocket = true;
+		if (!strcmp(token, "nokmem"))
+			cgroup_memory_nokmem = true;
 	}
 	return 0;
 }
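
As a rough illustration of the option handling in the last hunk, here is a minimal userspace sketch of parsing a comma-separated cgroup.memory= value. The names cgroup_memory_opts and parse_cgroup_memory are hypothetical, and the tokenizer uses strcspn() rather than whatever the kernel function itself uses. Only the two option names and the apparent "skip empty tokens, ignore unknown ones" behaviour are taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical userspace mirror of the two flags the patch toggles. */
struct cgroup_memory_opts {
	bool nosocket;	/* stands in for cgroup_memory_nosocket */
	bool nokmem;	/* stands in for cgroup_memory_nokmem */
};

/*
 * Parse a comma-separated option string such as "nosocket,nokmem".
 * Empty tokens are skipped and unrecognized tokens are silently
 * ignored; a token must match an option name exactly.
 */
static struct cgroup_memory_opts parse_cgroup_memory(const char *s)
{
	struct cgroup_memory_opts opts = { false, false };

	while (*s) {
		/* Length of the current token, up to the next ',' or NUL. */
		size_t len = strcspn(s, ",");

		if (len == strlen("nosocket") && !strncmp(s, "nosocket", len))
			opts.nosocket = true;
		if (len == strlen("nokmem") && !strncmp(s, "nokmem", len))
			opts.nokmem = true;

		s += len;
		if (*s == ',')
			s++;	/* step over the separator */
	}
	return opts;
}
```

With this sketch, "nosocket,nokmem" sets both flags, while a misspelled token such as "nokmemx" sets neither, so a typo on the command line would go unnoticed.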

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* Re: [PATCH 8/8] mm: memcontrol: introduce CONFIG_MEMCG_LEGACY_KMEM
  2015-12-08 18:34   ` Johannes Weiner
  (?)
@ 2015-12-09 11:31     ` Vladimir Davydov
  -1 siblings, 0 replies; 79+ messages in thread
From: Vladimir Davydov @ 2015-12-09 11:31 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, Michal Hocko, linux-mm, cgroups, linux-kernel,
	kernel-team

On Tue, Dec 08, 2015 at 01:34:25PM -0500, Johannes Weiner wrote:
> Let the user know that CONFIG_MEMCG_KMEM does not apply to the cgroup2
> interface. This also makes legacy-only code sections stand out better.
> 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 7/8] mm: memcontrol: account "kmem" consumers in cgroup2 memory controller
  2015-12-09 11:30     ` Vladimir Davydov
  (?)
@ 2015-12-09 14:32       ` Johannes Weiner
  -1 siblings, 0 replies; 79+ messages in thread
From: Johannes Weiner @ 2015-12-09 14:32 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Michal Hocko, linux-mm, cgroups, linux-kernel,
	kernel-team

On Wed, Dec 09, 2015 at 02:30:38PM +0300, Vladimir Davydov wrote:
> On Tue, Dec 08, 2015 at 01:34:24PM -0500, Johannes Weiner wrote:
> > The original cgroup memory controller has an extension to account slab
> > memory (and other "kernel memory" consumers) in a separate "kmem"
> > counter, once the user set an explicit limit on that "kmem" pool.
> > 
> > However, this includes various consumers whose sizes are directly
> > linked to userspace activity. Accounting them as an optional "kmem"
> > extension is problematic for several reasons:
> > 
> > 1. It leaves the main memory interface with incomplete semantics. A
> >    user who puts their workload into a cgroup and configures a memory
> >    limit does not expect us to leave holes in the containment as big
> >    as the dentry and inode cache, or the kernel stack pages.
> > 
> > 2. If the limit set on this random historical subgroup of consumers is
> >    reached, subsequent allocations will fail even when the main memory
> >    pool available to the cgroup is not yet exhausted and/or has
> >    reclaimable memory in it.
> > 
> > 3. Calling it 'kernel memory' is misleading. The dentry and inode
> >    caches are no more 'kernel' (or no less 'user') memory than the
> >    page cache itself. Treating these consumers as different classes is
> >    a historical implementation detail that should not leak to users.
> > 
> > So, in addition to page cache, anonymous memory, and network socket
> > memory, account the following memory consumers per default in the
> > cgroup2 memory controller:
> > 
> >      - threadinfo
> >      - task_struct
> >      - task_delay_info
> >      - pid
> >      - cred
> >      - mm_struct
> >      - vm_area_struct and vm_region (nommu)
> >      - anon_vma and anon_vma_chain
> >      - signal_struct
> >      - sighand_struct
> >      - fs_struct
> >      - files_struct
> >      - fdtable and fdtable->full_fds_bits
> >      - dentry and external_name
> >      - inode for all filesystems.
> > 
> > This should give us reasonable memory isolation for most common
> > workloads out of the box.
> > 
> > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> 
> Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>

Thank you!

> The patch looks good to me, but I think we still need to add a boot-time
> knob to disable kmem accounting, as we do for sockets:
> 
> From: Vladimir Davydov <vdavydov@virtuozzo.com>
> Subject: [PATCH] mm: memcontrol: allow to disable kmem accounting for cgroup2
> 
> Kmem accounting might incur overhead that some users can't put up with.
> Besides, the implementation is still considered unstable. So let's
> provide a way to disable it for those users who aren't happy with it.
> 
> To disable kmem accounting for cgroup2, pass cgroup.memory=nokmem at
> boot time.
> 
> Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

Especially in the early release phases, there might be birthing pains
that users in the field would want to work around. And I'd rather they
be able to selectively disable problematic parts during the transition
than switch back wholesale to the old cgroup interface.

For me that would be the prime reason: a temporary workaround for
legacy users until we get our stuff sorted out. Unacceptable overhead
or instability would be something we would have to address anyway.
And then it's fine too that the flag continues to use the historic
misnomer "kmem".
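
For reference, the onlining decision that the nokmem knob changes in memcg_propagate_kmem() can be modelled as a pure predicate. This is only a sketch: the function names and boolean parameters are hypothetical stand-ins for cgroup_subsys_on_dfl(memory_cgrp_subsys), memcg_kmem_online(parent) and cgroup_memory_nokmem, and it says nothing about the rest of the function.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Condition before the patch: on the default (cgroup2) hierarchy,
 * kmem accounting is always brought online for a new memcg.
 */
static bool should_online_kmem_old(bool on_dfl, bool parent_online)
{
	return on_dfl || parent_online;
}

/*
 * Condition after the patch: nokmem suppresses only the cgroup2
 * trigger. A parent that is already kmem-online still propagates
 * accounting to its children, regardless of the boot flag.
 */
static bool should_online_kmem_new(bool on_dfl, bool parent_online,
				   bool nokmem)
{
	return parent_online || (on_dfl && !nokmem);
}
```

Without the flag the two predicates agree for every input, so the only behavioural change is the case "cgroup2, parent not online, nokmem set", where accounting is no longer enabled.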

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 1/8] mm: memcontrol: drop unused @css argument in memcg_init_kmem
  2015-12-08 18:34   ` Johannes Weiner
  (?)
@ 2015-12-10 12:37     ` Michal Hocko
  -1 siblings, 0 replies; 79+ messages in thread
From: Michal Hocko @ 2015-12-10 12:37 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, Vladimir Davydov, linux-mm, cgroups, linux-kernel,
	kernel-team

On Tue 08-12-15 13:34:18, Johannes Weiner wrote:
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  include/net/tcp_memcontrol.h | 3 ++-
>  mm/memcontrol.c              | 6 +++---
>  net/ipv4/tcp_memcontrol.c    | 2 +-
>  3 files changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/include/net/tcp_memcontrol.h b/include/net/tcp_memcontrol.h
> index 3a17b16..dc2da2f 100644
> --- a/include/net/tcp_memcontrol.h
> +++ b/include/net/tcp_memcontrol.h
> @@ -1,6 +1,7 @@
>  #ifndef _TCP_MEMCG_H
>  #define _TCP_MEMCG_H
>  
> -int tcp_init_cgroup(struct mem_cgroup *memcg, struct cgroup_subsys *ss);
> +int tcp_init_cgroup(struct mem_cgroup *memcg);
>  void tcp_destroy_cgroup(struct mem_cgroup *memcg);
> +
>  #endif /* _TCP_MEMCG_H */
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 5fe45d68..eda8d43 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -3561,7 +3561,7 @@ static int mem_cgroup_oom_control_write(struct cgroup_subsys_state *css,
>  }
>  
>  #ifdef CONFIG_MEMCG_KMEM
> -static int memcg_init_kmem(struct mem_cgroup *memcg, struct cgroup_subsys *ss)
> +static int memcg_init_kmem(struct mem_cgroup *memcg)
>  {
>  	int ret;
>  
> @@ -3569,7 +3569,7 @@ static int memcg_init_kmem(struct mem_cgroup *memcg, struct cgroup_subsys *ss)
>  	if (ret)
>  		return ret;
>  
> -	return tcp_init_cgroup(memcg, ss);
> +	return tcp_init_cgroup(memcg);
>  }
>  
>  static void memcg_deactivate_kmem(struct mem_cgroup *memcg)
> @@ -4252,7 +4252,7 @@ mem_cgroup_css_online(struct cgroup_subsys_state *css)
>  	}
>  	mutex_unlock(&memcg_create_mutex);
>  
> -	ret = memcg_init_kmem(memcg, &memory_cgrp_subsys);
> +	ret = memcg_init_kmem(memcg);
>  	if (ret)
>  		return ret;
>  
> diff --git a/net/ipv4/tcp_memcontrol.c b/net/ipv4/tcp_memcontrol.c
> index 18bc7f7..133eb5e 100644
> --- a/net/ipv4/tcp_memcontrol.c
> +++ b/net/ipv4/tcp_memcontrol.c
> @@ -6,7 +6,7 @@
>  #include <linux/memcontrol.h>
>  #include <linux/module.h>
>  
> -int tcp_init_cgroup(struct mem_cgroup *memcg, struct cgroup_subsys *ss)
> +int tcp_init_cgroup(struct mem_cgroup *memcg)
>  {
>  	struct mem_cgroup *parent = parent_mem_cgroup(memcg);
>  	struct page_counter *counter_parent = NULL;
> -- 
> 2.6.3

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 2/8] mm: memcontrol: remove double kmem page_counter init
  2015-12-08 18:34   ` Johannes Weiner
  (?)
@ 2015-12-10 12:40     ` Michal Hocko
  -1 siblings, 0 replies; 79+ messages in thread
From: Michal Hocko @ 2015-12-10 12:40 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, Vladimir Davydov, linux-mm, cgroups, linux-kernel,
	kernel-team

On Tue 08-12-15 13:34:19, Johannes Weiner wrote:
> The kmem page_counter's limit is initialized to PAGE_COUNTER_MAX
> inside mem_cgroup_css_online(). There is no need to repeat this
> from memcg_propagate_kmem().
> 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/memcontrol.c | 24 ++++++++++--------------
>  1 file changed, 10 insertions(+), 14 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index eda8d43..02167db 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2840,8 +2840,7 @@ static u64 mem_cgroup_read_u64(struct cgroup_subsys_state *css,
>  }
>  
>  #ifdef CONFIG_MEMCG_KMEM
> -static int memcg_activate_kmem(struct mem_cgroup *memcg,
> -			       unsigned long nr_pages)
> +static int memcg_activate_kmem(struct mem_cgroup *memcg)
>  {
>  	int err = 0;
>  	int memcg_id;
> @@ -2876,13 +2875,6 @@ static int memcg_activate_kmem(struct mem_cgroup *memcg,
>  		goto out;
>  	}
>  
> -	/*
> -	 * We couldn't have accounted to this cgroup, because it hasn't got
> -	 * activated yet, so this should succeed.
> -	 */
> -	err = page_counter_limit(&memcg->kmem, nr_pages);
> -	VM_BUG_ON(err);
> -
>  	static_branch_inc(&memcg_kmem_enabled_key);
>  	/*
>  	 * A memory cgroup is considered kmem-active as soon as it gets
> @@ -2903,10 +2895,14 @@ static int memcg_update_kmem_limit(struct mem_cgroup *memcg,
>  	int ret;
>  
>  	mutex_lock(&memcg_limit_mutex);
> -	if (!memcg_kmem_is_active(memcg))
> -		ret = memcg_activate_kmem(memcg, limit);
> -	else
> -		ret = page_counter_limit(&memcg->kmem, limit);
> +	/* Top-level cgroup doesn't propagate from root */
> +	if (!memcg_kmem_is_active(memcg)) {
> +		ret = memcg_activate_kmem(memcg);
> +		if (ret)
> +			goto out;
> +	}
> +	ret = page_counter_limit(&memcg->kmem, limit);
> +out:
>  	mutex_unlock(&memcg_limit_mutex);
>  	return ret;
>  }
> @@ -2925,7 +2921,7 @@ static int memcg_propagate_kmem(struct mem_cgroup *memcg)
>  	 * after this point, because it has at least one child already.
>  	 */
>  	if (memcg_kmem_is_active(parent))
> -		ret = memcg_activate_kmem(memcg, PAGE_COUNTER_MAX);
> +		ret = memcg_activate_kmem(memcg);
>  	mutex_unlock(&memcg_limit_mutex);
>  	return ret;
>  }
> -- 
> 2.6.3
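
For readers following the control flow, the reworked limit update in the
quoted patch can be modeled in plain userspace C. This is only a sketch
under stated assumptions, not kernel code: struct group, struct counter,
COUNTER_MAX and the helper names are hypothetical stand-ins for
struct mem_cgroup, struct page_counter, PAGE_COUNTER_MAX and the memcg
functions. It shows that activation no longer takes a limit argument and
that the limit is always applied in one place afterwards.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for PAGE_COUNTER_MAX. */
#define COUNTER_MAX (~0UL)

/* Hypothetical stand-in for struct page_counter. */
struct counter {
	unsigned long usage;
	unsigned long limit;
};

/* Hypothetical stand-in for struct mem_cgroup; only kmem state is modeled. */
struct group {
	struct counter kmem;
	bool online;
};

/* Models page_counter_limit(): reject a limit below the current usage. */
static int counter_set_limit(struct counter *c, unsigned long limit)
{
	if (c->usage > limit)
		return -1;	/* assumption: any nonzero value means failure */
	c->limit = limit;
	return 0;
}

/* Models memcg_activate_kmem() after the patch: it no longer sets a limit. */
static int group_activate(struct group *g)
{
	assert(!g->online);	/* stands in for the kernel's BUG_ON() checks */
	g->online = true;
	return 0;
}

/* Models memcg_update_kmem_limit() after the patch. */
static int group_update_limit(struct group *g, unsigned long limit)
{
	int ret;

	if (!g->online) {
		ret = group_activate(g);
		if (ret)
			return ret;
	}
	/* Single place where the limit is applied, active or not. */
	return counter_set_limit(&g->kmem, limit);
}
```

A caller would start from a group whose kmem.limit is COUNTER_MAX (the
changelog says the real counter is initialized in mem_cgroup_css_online())
and call group_update_limit(&g, 1024). The first call both activates the
group and sets the limit; later calls only adjust the limit.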

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 3/8] mm: memcontrol: give the kmem states more descriptive names
  2015-12-08 18:34   ` Johannes Weiner
@ 2015-12-10 12:47     ` Michal Hocko
  -1 siblings, 0 replies; 79+ messages in thread
From: Michal Hocko @ 2015-12-10 12:47 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, Vladimir Davydov, linux-mm, cgroups, linux-kernel,
	kernel-team

On Tue 08-12-15 13:34:20, Johannes Weiner wrote:
> On any given memcg, the kmem accounting feature has three separate
> states: not initialized, structures allocated, and actively accounting
> slab memory. These are represented through a combination of the
> kmem_acct_activated and kmem_acct_active flags, which is confusing.
> 
> Convert to a kmem_state enum with the states NONE, ALLOCATED, and
> ONLINE. Then rename the functions to modify the state accordingly.
> This follows the nomenclature of css object states more closely.

I like this! It is much easier to follow than two separate flags.
 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

Acked-by: Michal Hocko <mhocko@suse.com>
> ---
>  include/linux/memcontrol.h | 15 ++++++++-----
>  mm/memcontrol.c            | 52 ++++++++++++++++++++++------------------------
>  mm/slab_common.c           |  4 ++--
>  mm/vmscan.c                |  2 +-
>  4 files changed, 38 insertions(+), 35 deletions(-)
> 
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 189f04d..54dab4d 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -152,6 +152,12 @@ struct mem_cgroup_thresholds {
>  	struct mem_cgroup_threshold_ary *spare;
>  };
>  
> +enum memcg_kmem_state {
> +	KMEM_NONE,
> +	KMEM_ALLOCATED,
> +	KMEM_ONLINE,
> +};
> +
>  /*
>   * The memory controller data structure. The memory controller controls both
>   * page cache and RSS per cgroup. We would eventually like to provide
> @@ -233,8 +239,7 @@ struct mem_cgroup {
>  #if defined(CONFIG_MEMCG_KMEM)
>          /* Index in the kmem_cache->memcg_params.memcg_caches array */
>  	int kmemcg_id;
> -	bool kmem_acct_activated;
> -	bool kmem_acct_active;
> +	enum memcg_kmem_state kmem_state;
>  #endif
>  
>  	int last_scanned_node;
> @@ -750,9 +755,9 @@ static inline bool memcg_kmem_enabled(void)
>  	return static_branch_unlikely(&memcg_kmem_enabled_key);
>  }
>  
> -static inline bool memcg_kmem_is_active(struct mem_cgroup *memcg)
> +static inline bool memcg_kmem_online(struct mem_cgroup *memcg)
>  {
> -	return memcg->kmem_acct_active;
> +	return memcg->kmem_state == KMEM_ONLINE;
>  }
>  
>  /*
> @@ -850,7 +855,7 @@ static inline bool memcg_kmem_enabled(void)
>  	return false;
>  }
>  
> -static inline bool memcg_kmem_is_active(struct mem_cgroup *memcg)
> +static inline bool memcg_kmem_online(struct mem_cgroup *memcg)
>  {
>  	return false;
>  }
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 02167db..22b8c4f 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2357,7 +2357,7 @@ int __memcg_kmem_charge_memcg(struct page *page, gfp_t gfp, int order,
>  	struct page_counter *counter;
>  	int ret;
>  
> -	if (!memcg_kmem_is_active(memcg))
> +	if (!memcg_kmem_online(memcg))
>  		return 0;
>  
>  	if (!page_counter_try_charge(&memcg->kmem, nr_pages, &counter))
> @@ -2840,14 +2840,13 @@ static u64 mem_cgroup_read_u64(struct cgroup_subsys_state *css,
>  }
>  
>  #ifdef CONFIG_MEMCG_KMEM
> -static int memcg_activate_kmem(struct mem_cgroup *memcg)
> +static int memcg_online_kmem(struct mem_cgroup *memcg)
>  {
>  	int err = 0;
>  	int memcg_id;
>  
>  	BUG_ON(memcg->kmemcg_id >= 0);
> -	BUG_ON(memcg->kmem_acct_activated);
> -	BUG_ON(memcg->kmem_acct_active);
> +	BUG_ON(memcg->kmem_state);
>  
>  	/*
>  	 * For simplicity, we won't allow this to be disabled.  It also can't
> @@ -2877,14 +2876,13 @@ static int memcg_activate_kmem(struct mem_cgroup *memcg)
>  
>  	static_branch_inc(&memcg_kmem_enabled_key);
>  	/*
> -	 * A memory cgroup is considered kmem-active as soon as it gets
> +	 * A memory cgroup is considered kmem-online as soon as it gets
>  	 * kmemcg_id. Setting the id after enabling static branching will
>  	 * guarantee no one starts accounting before all call sites are
>  	 * patched.
>  	 */
>  	memcg->kmemcg_id = memcg_id;
> -	memcg->kmem_acct_activated = true;
> -	memcg->kmem_acct_active = true;
> +	memcg->kmem_state = KMEM_ONLINE;
>  out:
>  	return err;
>  }
> @@ -2896,8 +2894,8 @@ static int memcg_update_kmem_limit(struct mem_cgroup *memcg,
>  
>  	mutex_lock(&memcg_limit_mutex);
>  	/* Top-level cgroup doesn't propagate from root */
> -	if (!memcg_kmem_is_active(memcg)) {
> -		ret = memcg_activate_kmem(memcg);
> +	if (!memcg_kmem_online(memcg)) {
> +		ret = memcg_online_kmem(memcg);
>  		if (ret)
>  			goto out;
>  	}
> @@ -2917,11 +2915,12 @@ static int memcg_propagate_kmem(struct mem_cgroup *memcg)
>  
>  	mutex_lock(&memcg_limit_mutex);
>  	/*
> -	 * If the parent cgroup is not kmem-active now, it cannot be activated
> -	 * after this point, because it has at least one child already.
> +	 * If the parent cgroup is not kmem-online now, it cannot be
> +	 * onlined after this point, because it has at least one child
> +	 * already.
>  	 */
> -	if (memcg_kmem_is_active(parent))
> -		ret = memcg_activate_kmem(memcg);
> +	if (memcg_kmem_online(parent))
> +		ret = memcg_online_kmem(memcg);
>  	mutex_unlock(&memcg_limit_mutex);
>  	return ret;
>  }
> @@ -3568,22 +3567,21 @@ static int memcg_init_kmem(struct mem_cgroup *memcg)
>  	return tcp_init_cgroup(memcg);
>  }
>  
> -static void memcg_deactivate_kmem(struct mem_cgroup *memcg)
> +static void memcg_offline_kmem(struct mem_cgroup *memcg)
>  {
>  	struct cgroup_subsys_state *css;
>  	struct mem_cgroup *parent, *child;
>  	int kmemcg_id;
>  
> -	if (!memcg->kmem_acct_active)
> +	if (memcg->kmem_state != KMEM_ONLINE)
>  		return;
> -
>  	/*
> -	 * Clear the 'active' flag before clearing memcg_caches arrays entries.
> -	 * Since we take the slab_mutex in memcg_deactivate_kmem_caches(), it
> -	 * guarantees no cache will be created for this cgroup after we are
> -	 * done (see memcg_create_kmem_cache()).
> +	 * Clear the online state before clearing memcg_caches array
> +	 * entries. The slab_mutex in memcg_deactivate_kmem_caches()
> +	 * guarantees that no cache will be created for this cgroup
> +	 * after we are done (see memcg_create_kmem_cache()).
>  	 */
> -	memcg->kmem_acct_active = false;
> +	memcg->kmem_state = KMEM_ALLOCATED;
>  
>  	memcg_deactivate_kmem_caches(memcg);
>  
> @@ -3614,9 +3612,9 @@ static void memcg_deactivate_kmem(struct mem_cgroup *memcg)
>  	memcg_free_cache_id(kmemcg_id);
>  }
>  
> -static void memcg_destroy_kmem(struct mem_cgroup *memcg)
> +static void memcg_free_kmem(struct mem_cgroup *memcg)
>  {
> -	if (memcg->kmem_acct_activated) {
> +	if (memcg->kmem_state == KMEM_ALLOCATED) {
>  		memcg_destroy_kmem_caches(memcg);
>  		static_branch_dec(&memcg_kmem_enabled_key);
>  		WARN_ON(page_counter_read(&memcg->kmem));
> @@ -3629,11 +3627,11 @@ static int memcg_init_kmem(struct mem_cgroup *memcg, struct cgroup_subsys *ss)
>  	return 0;
>  }
>  
> -static void memcg_deactivate_kmem(struct mem_cgroup *memcg)
> +static void memcg_offline_kmem(struct mem_cgroup *memcg)
>  {
>  }
>  
> -static void memcg_destroy_kmem(struct mem_cgroup *memcg)
> +static void memcg_free_kmem(struct mem_cgroup *memcg)
>  {
>  }
>  #endif
> @@ -4286,7 +4284,7 @@ static void mem_cgroup_css_offline(struct cgroup_subsys_state *css)
>  
>  	vmpressure_cleanup(&memcg->vmpressure);
>  
> -	memcg_deactivate_kmem(memcg);
> +	memcg_offline_kmem(memcg);
>  
>  	wb_memcg_offline(memcg);
>  }
> @@ -4295,7 +4293,7 @@ static void mem_cgroup_css_free(struct cgroup_subsys_state *css)
>  {
>  	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
>  
> -	memcg_destroy_kmem(memcg);
> +	memcg_free_kmem(memcg);
>  #ifdef CONFIG_INET
>  	if (cgroup_subsys_on_dfl(memory_cgrp_subsys) && !cgroup_memory_nosocket)
>  		static_branch_dec(&memcg_sockets_enabled_key);
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index e016178..8c262e6 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -503,10 +503,10 @@ void memcg_create_kmem_cache(struct mem_cgroup *memcg,
>  	mutex_lock(&slab_mutex);
>  
>  	/*
> -	 * The memory cgroup could have been deactivated while the cache
> +	 * The memory cgroup could have been offlined while the cache
>  	 * creation work was pending.
>  	 */
> -	if (!memcg_kmem_is_active(memcg))
> +	if (!memcg_kmem_online(memcg))
>  		goto out_unlock;
>  
>  	idx = memcg_cache_id(memcg);
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 50e54c0..2dbc679 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -411,7 +411,7 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
>  	struct shrinker *shrinker;
>  	unsigned long freed = 0;
>  
> -	if (memcg && !memcg_kmem_is_active(memcg))
> +	if (memcg && !memcg_kmem_online(memcg))
>  		return 0;
>  
>  	if (nr_scanned == 0)
> -- 
> 2.6.3
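
The three-state lifecycle from the quoted changelog can be sketched as a
small userspace state machine. The enum values match the patch; everything
else (struct group and the group_* helpers) is a hypothetical stand-in for
struct mem_cgroup and the renamed memcg_*_kmem() functions, and the
kernel's real side effects (cache setup, static branch counting) are
deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Same three states as the enum added by the quoted patch. */
enum kmem_state {
	KMEM_NONE,	/* kmem accounting never set up */
	KMEM_ALLOCATED,	/* structures exist, accounting has stopped */
	KMEM_ONLINE,	/* actively accounting */
};

/* Hypothetical stand-in for struct mem_cgroup; only the state is modeled. */
struct group {
	enum kmem_state kmem_state;
};

/* Models memcg_kmem_online(). */
static bool group_kmem_online(const struct group *g)
{
	return g->kmem_state == KMEM_ONLINE;
}

/* Models memcg_online_kmem(): only legal from KMEM_NONE. */
static void group_online_kmem(struct group *g)
{
	assert(g->kmem_state == KMEM_NONE);	/* stands in for BUG_ON() */
	g->kmem_state = KMEM_ONLINE;
}

/* Models memcg_offline_kmem(): a no-op unless currently online. */
static void group_offline_kmem(struct group *g)
{
	if (g->kmem_state != KMEM_ONLINE)
		return;
	g->kmem_state = KMEM_ALLOCATED;
}

/*
 * Models the check in memcg_free_kmem(): returns true when there is
 * state to tear down, i.e. the group was onlined and then offlined.
 */
static bool group_free_kmem(const struct group *g)
{
	return g->kmem_state == KMEM_ALLOCATED;
}
```

With a single enum, a group that was never onlined stays in KMEM_NONE and
is skipped by both the offline and the free step, which is the case the
old pair of booleans encoded as "neither flag set".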

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 3/8] mm: memcontrol: give the kmem states more descriptive names
@ 2015-12-10 12:47     ` Michal Hocko
  0 siblings, 0 replies; 79+ messages in thread
From: Michal Hocko @ 2015-12-10 12:47 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, Vladimir Davydov, linux-mm, cgroups, linux-kernel,
	kernel-team

On Tue 08-12-15 13:34:20, Johannes Weiner wrote:
> On any given memcg, the kmem accounting feature has three separate
> states: not initialized, structures allocated, and actively accounting
> slab memory. These are represented through a combination of the
> kmem_acct_activated and kmem_acct_active flags, which is confusing.
> 
> Convert to a kmem_state enum with the states NONE, ALLOCATED, and
> ONLINE. Then rename the functions to modify the state accordingly.
> This follows the nomenclature of css object states more closely.

I like this! It is much easier to follow than two separate flags.
 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

Acked-by: Michal Hocko <mhocko@suse.com>
> ---
>  include/linux/memcontrol.h | 15 ++++++++-----
>  mm/memcontrol.c            | 52 ++++++++++++++++++++++------------------------
>  mm/slab_common.c           |  4 ++--
>  mm/vmscan.c                |  2 +-
>  4 files changed, 38 insertions(+), 35 deletions(-)
> 
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 189f04d..54dab4d 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -152,6 +152,12 @@ struct mem_cgroup_thresholds {
>  	struct mem_cgroup_threshold_ary *spare;
>  };
>  
> +enum memcg_kmem_state {
> +	KMEM_NONE,
> +	KMEM_ALLOCATED,
> +	KMEM_ONLINE,
> +};
> +
>  /*
>   * The memory controller data structure. The memory controller controls both
>   * page cache and RSS per cgroup. We would eventually like to provide
> @@ -233,8 +239,7 @@ struct mem_cgroup {
>  #if defined(CONFIG_MEMCG_KMEM)
>          /* Index in the kmem_cache->memcg_params.memcg_caches array */
>  	int kmemcg_id;
> -	bool kmem_acct_activated;
> -	bool kmem_acct_active;
> +	enum memcg_kmem_state kmem_state;
>  #endif
>  
>  	int last_scanned_node;
> @@ -750,9 +755,9 @@ static inline bool memcg_kmem_enabled(void)
>  	return static_branch_unlikely(&memcg_kmem_enabled_key);
>  }
>  
> -static inline bool memcg_kmem_is_active(struct mem_cgroup *memcg)
> +static inline bool memcg_kmem_online(struct mem_cgroup *memcg)
>  {
> -	return memcg->kmem_acct_active;
> +	return memcg->kmem_state == KMEM_ONLINE;
>  }
>  
>  /*
> @@ -850,7 +855,7 @@ static inline bool memcg_kmem_enabled(void)
>  	return false;
>  }
>  
> -static inline bool memcg_kmem_is_active(struct mem_cgroup *memcg)
> +static inline bool memcg_kmem_online(struct mem_cgroup *memcg)
>  {
>  	return false;
>  }
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 02167db..22b8c4f 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2357,7 +2357,7 @@ int __memcg_kmem_charge_memcg(struct page *page, gfp_t gfp, int order,
>  	struct page_counter *counter;
>  	int ret;
>  
> -	if (!memcg_kmem_is_active(memcg))
> +	if (!memcg_kmem_online(memcg))
>  		return 0;
>  
>  	if (!page_counter_try_charge(&memcg->kmem, nr_pages, &counter))
> @@ -2840,14 +2840,13 @@ static u64 mem_cgroup_read_u64(struct cgroup_subsys_state *css,
>  }
>  
>  #ifdef CONFIG_MEMCG_KMEM
> -static int memcg_activate_kmem(struct mem_cgroup *memcg)
> +static int memcg_online_kmem(struct mem_cgroup *memcg)
>  {
>  	int err = 0;
>  	int memcg_id;
>  
>  	BUG_ON(memcg->kmemcg_id >= 0);
> -	BUG_ON(memcg->kmem_acct_activated);
> -	BUG_ON(memcg->kmem_acct_active);
> +	BUG_ON(memcg->kmem_state);
>  
>  	/*
>  	 * For simplicity, we won't allow this to be disabled.  It also can't
> @@ -2877,14 +2876,13 @@ static int memcg_activate_kmem(struct mem_cgroup *memcg)
>  
>  	static_branch_inc(&memcg_kmem_enabled_key);
>  	/*
> -	 * A memory cgroup is considered kmem-active as soon as it gets
> +	 * A memory cgroup is considered kmem-online as soon as it gets
>  	 * kmemcg_id. Setting the id after enabling static branching will
>  	 * guarantee no one starts accounting before all call sites are
>  	 * patched.
>  	 */
>  	memcg->kmemcg_id = memcg_id;
> -	memcg->kmem_acct_activated = true;
> -	memcg->kmem_acct_active = true;
> +	memcg->kmem_state = KMEM_ONLINE;
>  out:
>  	return err;
>  }
> @@ -2896,8 +2894,8 @@ static int memcg_update_kmem_limit(struct mem_cgroup *memcg,
>  
>  	mutex_lock(&memcg_limit_mutex);
>  	/* Top-level cgroup doesn't propagate from root */
> -	if (!memcg_kmem_is_active(memcg)) {
> -		ret = memcg_activate_kmem(memcg);
> +	if (!memcg_kmem_online(memcg)) {
> +		ret = memcg_online_kmem(memcg);
>  		if (ret)
>  			goto out;
>  	}
> @@ -2917,11 +2915,12 @@ static int memcg_propagate_kmem(struct mem_cgroup *memcg)
>  
>  	mutex_lock(&memcg_limit_mutex);
>  	/*
> -	 * If the parent cgroup is not kmem-active now, it cannot be activated
> -	 * after this point, because it has at least one child already.
> +	 * If the parent cgroup is not kmem-online now, it cannot be
> +	 * onlined after this point, because it has at least one child
> +	 * already.
>  	 */
> -	if (memcg_kmem_is_active(parent))
> -		ret = memcg_activate_kmem(memcg);
> +	if (memcg_kmem_online(parent))
> +		ret = memcg_online_kmem(memcg);
>  	mutex_unlock(&memcg_limit_mutex);
>  	return ret;
>  }
> @@ -3568,22 +3567,21 @@ static int memcg_init_kmem(struct mem_cgroup *memcg)
>  	return tcp_init_cgroup(memcg);
>  }
>  
> -static void memcg_deactivate_kmem(struct mem_cgroup *memcg)
> +static void memcg_offline_kmem(struct mem_cgroup *memcg)
>  {
>  	struct cgroup_subsys_state *css;
>  	struct mem_cgroup *parent, *child;
>  	int kmemcg_id;
>  
> -	if (!memcg->kmem_acct_active)
> +	if (memcg->kmem_state != KMEM_ONLINE)
>  		return;
> -
>  	/*
> -	 * Clear the 'active' flag before clearing memcg_caches arrays entries.
> -	 * Since we take the slab_mutex in memcg_deactivate_kmem_caches(), it
> -	 * guarantees no cache will be created for this cgroup after we are
> -	 * done (see memcg_create_kmem_cache()).
> +	 * Clear the online state before clearing memcg_caches array
> +	 * entries. The slab_mutex in memcg_deactivate_kmem_caches()
> +	 * guarantees that no cache will be created for this cgroup
> +	 * after we are done (see memcg_create_kmem_cache()).
>  	 */
> -	memcg->kmem_acct_active = false;
> +	memcg->kmem_state = KMEM_ALLOCATED;
>  
>  	memcg_deactivate_kmem_caches(memcg);
>  
> @@ -3614,9 +3612,9 @@ static void memcg_deactivate_kmem(struct mem_cgroup *memcg)
>  	memcg_free_cache_id(kmemcg_id);
>  }
>  
> -static void memcg_destroy_kmem(struct mem_cgroup *memcg)
> +static void memcg_free_kmem(struct mem_cgroup *memcg)
>  {
> -	if (memcg->kmem_acct_activated) {
> +	if (memcg->kmem_state == KMEM_ALLOCATED) {
>  		memcg_destroy_kmem_caches(memcg);
>  		static_branch_dec(&memcg_kmem_enabled_key);
>  		WARN_ON(page_counter_read(&memcg->kmem));
> @@ -3629,11 +3627,11 @@ static int memcg_init_kmem(struct mem_cgroup *memcg, struct cgroup_subsys *ss)
>  	return 0;
>  }
>  
> -static void memcg_deactivate_kmem(struct mem_cgroup *memcg)
> +static void memcg_offline_kmem(struct mem_cgroup *memcg)
>  {
>  }
>  
> -static void memcg_destroy_kmem(struct mem_cgroup *memcg)
> +static void memcg_free_kmem(struct mem_cgroup *memcg)
>  {
>  }
>  #endif
> @@ -4286,7 +4284,7 @@ static void mem_cgroup_css_offline(struct cgroup_subsys_state *css)
>  
>  	vmpressure_cleanup(&memcg->vmpressure);
>  
> -	memcg_deactivate_kmem(memcg);
> +	memcg_offline_kmem(memcg);
>  
>  	wb_memcg_offline(memcg);
>  }
> @@ -4295,7 +4293,7 @@ static void mem_cgroup_css_free(struct cgroup_subsys_state *css)
>  {
>  	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
>  
> -	memcg_destroy_kmem(memcg);
> +	memcg_free_kmem(memcg);
>  #ifdef CONFIG_INET
>  	if (cgroup_subsys_on_dfl(memory_cgrp_subsys) && !cgroup_memory_nosocket)
>  		static_branch_dec(&memcg_sockets_enabled_key);
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index e016178..8c262e6 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -503,10 +503,10 @@ void memcg_create_kmem_cache(struct mem_cgroup *memcg,
>  	mutex_lock(&slab_mutex);
>  
>  	/*
> -	 * The memory cgroup could have been deactivated while the cache
> +	 * The memory cgroup could have been offlined while the cache
>  	 * creation work was pending.
>  	 */
> -	if (!memcg_kmem_is_active(memcg))
> +	if (!memcg_kmem_online(memcg))
>  		goto out_unlock;
>  
>  	idx = memcg_cache_id(memcg);
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 50e54c0..2dbc679 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -411,7 +411,7 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
>  	struct shrinker *shrinker;
>  	unsigned long freed = 0;
>  
> -	if (memcg && !memcg_kmem_is_active(memcg))
> +	if (memcg && !memcg_kmem_online(memcg))
>  		return 0;
>  
>  	if (nr_scanned == 0)
> -- 
> 2.6.3

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 4/8] mm: memcontrol: group kmem init and exit functions together
  2015-12-08 18:34   ` Johannes Weiner
@ 2015-12-10 12:56     ` Michal Hocko
  -1 siblings, 0 replies; 79+ messages in thread
From: Michal Hocko @ 2015-12-10 12:56 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, Vladimir Davydov, linux-mm, cgroups, linux-kernel,
	kernel-team

On Tue 08-12-15 13:34:21, Johannes Weiner wrote:
> Put all the related code to setup and teardown the kmem accounting
> state into the same location. No functional change intended.
> 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/memcontrol.c | 157 +++++++++++++++++++++++++++-----------------------------
>  1 file changed, 76 insertions(+), 81 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 22b8c4f..5118618 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2924,12 +2924,88 @@ static int memcg_propagate_kmem(struct mem_cgroup *memcg)
>  	mutex_unlock(&memcg_limit_mutex);
>  	return ret;
>  }
> +
> +static int memcg_init_kmem(struct mem_cgroup *memcg)
> +{
> +	int ret;
> +
> +	ret = memcg_propagate_kmem(memcg);
> +	if (ret)
> +		return ret;
> +
> +	return tcp_init_cgroup(memcg);
> +}
> +
> +static void memcg_offline_kmem(struct mem_cgroup *memcg)
> +{
> +	struct cgroup_subsys_state *css;
> +	struct mem_cgroup *parent, *child;
> +	int kmemcg_id;
> +
> +	if (memcg->kmem_state != KMEM_ONLINE)
> +		return;
> +	/*
> +	 * Clear the online state before clearing memcg_caches array
> +	 * entries. The slab_mutex in memcg_deactivate_kmem_caches()
> +	 * guarantees that no cache will be created for this cgroup
> +	 * after we are done (see memcg_create_kmem_cache()).
> +	 */
> +	memcg->kmem_state = KMEM_ALLOCATED;
> +
> +	memcg_deactivate_kmem_caches(memcg);
> +
> +	kmemcg_id = memcg->kmemcg_id;
> +	BUG_ON(kmemcg_id < 0);
> +
> +	parent = parent_mem_cgroup(memcg);
> +	if (!parent)
> +		parent = root_mem_cgroup;
> +
> +	/*
> +	 * Change kmemcg_id of this cgroup and all its descendants to the
> +	 * parent's id, and then move all entries from this cgroup's list_lrus
> +	 * to ones of the parent. After we have finished, all list_lrus
> +	 * corresponding to this cgroup are guaranteed to remain empty. The
> +	 * ordering is imposed by list_lru_node->lock taken by
> +	 * memcg_drain_all_list_lrus().
> +	 */
> +	css_for_each_descendant_pre(css, &memcg->css) {
> +		child = mem_cgroup_from_css(css);
> +		BUG_ON(child->kmemcg_id != kmemcg_id);
> +		child->kmemcg_id = parent->kmemcg_id;
> +		if (!memcg->use_hierarchy)
> +			break;
> +	}
> +	memcg_drain_all_list_lrus(kmemcg_id, parent->kmemcg_id);
> +
> +	memcg_free_cache_id(kmemcg_id);
> +}
> +
> +static void memcg_free_kmem(struct mem_cgroup *memcg)
> +{
> +	if (memcg->kmem_state == KMEM_ALLOCATED) {
> +		memcg_destroy_kmem_caches(memcg);
> +		static_branch_dec(&memcg_kmem_enabled_key);
> +		WARN_ON(page_counter_read(&memcg->kmem));
> +	}
> +	tcp_destroy_cgroup(memcg);
> +}
>  #else
>  static int memcg_update_kmem_limit(struct mem_cgroup *memcg,
>  				   unsigned long limit)
>  {
>  	return -EINVAL;
>  }
> +static int memcg_init_kmem(struct mem_cgroup *memcg)
> +{
> +	return 0;
> +}
> +static void memcg_offline_kmem(struct mem_cgroup *memcg)
> +{
> +}
> +static void memcg_free_kmem(struct mem_cgroup *memcg)
> +{
> +}
>  #endif /* CONFIG_MEMCG_KMEM */
>  
>  /*
> @@ -3555,87 +3631,6 @@ static int mem_cgroup_oom_control_write(struct cgroup_subsys_state *css,
>  	return 0;
>  }
>  
> -#ifdef CONFIG_MEMCG_KMEM
> -static int memcg_init_kmem(struct mem_cgroup *memcg)
> -{
> -	int ret;
> -
> -	ret = memcg_propagate_kmem(memcg);
> -	if (ret)
> -		return ret;
> -
> -	return tcp_init_cgroup(memcg);
> -}
> -
> -static void memcg_offline_kmem(struct mem_cgroup *memcg)
> -{
> -	struct cgroup_subsys_state *css;
> -	struct mem_cgroup *parent, *child;
> -	int kmemcg_id;
> -
> -	if (memcg->kmem_state != KMEM_ONLINE)
> -		return;
> -	/*
> -	 * Clear the online state before clearing memcg_caches array
> -	 * entries. The slab_mutex in memcg_deactivate_kmem_caches()
> -	 * guarantees that no cache will be created for this cgroup
> -	 * after we are done (see memcg_create_kmem_cache()).
> -	 */
> -	memcg->kmem_state = KMEM_ALLOCATED;
> -
> -	memcg_deactivate_kmem_caches(memcg);
> -
> -	kmemcg_id = memcg->kmemcg_id;
> -	BUG_ON(kmemcg_id < 0);
> -
> -	parent = parent_mem_cgroup(memcg);
> -	if (!parent)
> -		parent = root_mem_cgroup;
> -
> -	/*
> -	 * Change kmemcg_id of this cgroup and all its descendants to the
> -	 * parent's id, and then move all entries from this cgroup's list_lrus
> -	 * to ones of the parent. After we have finished, all list_lrus
> -	 * corresponding to this cgroup are guaranteed to remain empty. The
> -	 * ordering is imposed by list_lru_node->lock taken by
> -	 * memcg_drain_all_list_lrus().
> -	 */
> -	css_for_each_descendant_pre(css, &memcg->css) {
> -		child = mem_cgroup_from_css(css);
> -		BUG_ON(child->kmemcg_id != kmemcg_id);
> -		child->kmemcg_id = parent->kmemcg_id;
> -		if (!memcg->use_hierarchy)
> -			break;
> -	}
> -	memcg_drain_all_list_lrus(kmemcg_id, parent->kmemcg_id);
> -
> -	memcg_free_cache_id(kmemcg_id);
> -}
> -
> -static void memcg_free_kmem(struct mem_cgroup *memcg)
> -{
> -	if (memcg->kmem_state == KMEM_ALLOCATED) {
> -		memcg_destroy_kmem_caches(memcg);
> -		static_branch_dec(&memcg_kmem_enabled_key);
> -		WARN_ON(page_counter_read(&memcg->kmem));
> -	}
> -	tcp_destroy_cgroup(memcg);
> -}
> -#else
> -static int memcg_init_kmem(struct mem_cgroup *memcg, struct cgroup_subsys *ss)
> -{
> -	return 0;
> -}
> -
> -static void memcg_offline_kmem(struct mem_cgroup *memcg)
> -{
> -}
> -
> -static void memcg_free_kmem(struct mem_cgroup *memcg)
> -{
> -}
> -#endif
> -
>  #ifdef CONFIG_CGROUP_WRITEBACK
>  
>  struct list_head *mem_cgroup_cgwb_list(struct mem_cgroup *memcg)
> -- 
> 2.6.3

-- 
Michal Hocko
SUSE Labs
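
As a rough illustration of the lifecycle that memcg_offline_kmem() and
memcg_free_kmem() implement in the quoted patch, the userspace sketch below
models the kmem states. KMEM_NONE, struct fake_memcg and the fake_* helpers
are assumptions made up for this example and are not kernel code; only
KMEM_ONLINE and KMEM_ALLOCATED appear in the quoted diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's kmem states. Only
 * KMEM_ALLOCATED and KMEM_ONLINE are named in the quoted diff;
 * KMEM_NONE is an assumption for this sketch. */
enum fake_kmem_state { KMEM_NONE, KMEM_ALLOCATED, KMEM_ONLINE };

struct fake_memcg {
	enum fake_kmem_state kmem_state;
	int caches_destroyed;	/* counts teardown calls, for illustration */
};

/* Mirrors the early return in memcg_offline_kmem(): only an online
 * group is demoted, and the online state is cleared first. */
static bool fake_offline_kmem(struct fake_memcg *memcg)
{
	assert(memcg);
	if (memcg->kmem_state != KMEM_ONLINE)
		return false;
	memcg->kmem_state = KMEM_ALLOCATED;
	return true;
}

/* Mirrors the check in memcg_free_kmem(): resources are released only
 * when the state is KMEM_ALLOCATED. Resetting to KMEM_NONE afterwards
 * is an addition of this sketch so a second call is visibly a no-op. */
static void fake_free_kmem(struct fake_memcg *memcg)
{
	assert(memcg);
	if (memcg->kmem_state == KMEM_ALLOCATED) {
		memcg->caches_destroyed++;
		memcg->kmem_state = KMEM_NONE;
	}
}
```

Calling fake_offline_kmem() twice on the same group only demotes it once,
which is the property the state check at the top of memcg_offline_kmem()
provides.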

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 5/8] mm: memcontrol: separate kmem code from legacy tcp accounting code
  2015-12-08 18:34   ` Johannes Weiner
  (?)
@ 2015-12-10 12:59     ` Michal Hocko
  -1 siblings, 0 replies; 79+ messages in thread
From: Michal Hocko @ 2015-12-10 12:59 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, Vladimir Davydov, linux-mm, cgroups, linux-kernel,
	kernel-team

On Tue 08-12-15 13:34:22, Johannes Weiner wrote:
> The cgroup2 memory controller will include important in-kernel memory
> consumers per default, including socket memory, but it will no longer
> carry the historic tcp control interface.
> 
> Separate the kmem state init from the tcp control interface init in
> preparation for that.
> 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/memcontrol.c | 33 ++++++++++++---------------------
>  1 file changed, 12 insertions(+), 21 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 5118618..55a3f07 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2925,17 +2925,6 @@ static int memcg_propagate_kmem(struct mem_cgroup *memcg)
>  	return ret;
>  }
>  
> -static int memcg_init_kmem(struct mem_cgroup *memcg)
> -{
> -	int ret;
> -
> -	ret = memcg_propagate_kmem(memcg);
> -	if (ret)
> -		return ret;
> -
> -	return tcp_init_cgroup(memcg);
> -}
> -
>  static void memcg_offline_kmem(struct mem_cgroup *memcg)
>  {
>  	struct cgroup_subsys_state *css;
> @@ -2988,7 +2977,6 @@ static void memcg_free_kmem(struct mem_cgroup *memcg)
>  		static_branch_dec(&memcg_kmem_enabled_key);
>  		WARN_ON(page_counter_read(&memcg->kmem));
>  	}
> -	tcp_destroy_cgroup(memcg);
>  }
>  #else
>  static int memcg_update_kmem_limit(struct mem_cgroup *memcg,
> @@ -2996,16 +2984,9 @@ static int memcg_update_kmem_limit(struct mem_cgroup *memcg,
>  {
>  	return -EINVAL;
>  }
> -static int memcg_init_kmem(struct mem_cgroup *memcg)
> -{
> -	return 0;
> -}
>  static void memcg_offline_kmem(struct mem_cgroup *memcg)
>  {
>  }
> -static void memcg_free_kmem(struct mem_cgroup *memcg)
> -{
> -}
>  #endif /* CONFIG_MEMCG_KMEM */
>  
>  /*
> @@ -4241,9 +4222,14 @@ mem_cgroup_css_online(struct cgroup_subsys_state *css)
>  	}
>  	mutex_unlock(&memcg_create_mutex);
>  
> -	ret = memcg_init_kmem(memcg);
> +#ifdef CONFIG_MEMCG_KMEM
> +	ret = memcg_propagate_kmem(memcg);
>  	if (ret)
>  		return ret;
> +	ret = tcp_init_cgroup(memcg);
> +	if (ret)
> +		return ret;
> +#endif
>  
>  #ifdef CONFIG_INET
>  	if (cgroup_subsys_on_dfl(memory_cgrp_subsys) && !cgroup_memory_nosocket)
> @@ -4288,11 +4274,16 @@ static void mem_cgroup_css_free(struct cgroup_subsys_state *css)
>  {
>  	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
>  
> -	memcg_free_kmem(memcg);
>  #ifdef CONFIG_INET
>  	if (cgroup_subsys_on_dfl(memory_cgrp_subsys) && !cgroup_memory_nosocket)
>  		static_branch_dec(&memcg_sockets_enabled_key);
>  #endif
> +
> +#ifdef CONFIG_MEMCG_KMEM
> +	memcg_free_kmem(memcg);
> +	tcp_destroy_cgroup(memcg);
> +#endif
> +
>  	__mem_cgroup_free(memcg);
>  }
>  
> -- 
> 2.6.3

-- 
Michal Hocko
SUSE Labs
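
The open-coded sequence that the quoted patch puts into
mem_cgroup_css_online() can be pictured with the small userspace sketch
below. struct fake_hooks and fake_css_online() are hypothetical names used
only for this illustration; the integer fields stand in for the return
values of memcg_propagate_kmem() and tcp_init_cgroup().

```c
#include <assert.h>

/* Hypothetical test double: the *_ret fields stand in for the return
 * values of memcg_propagate_kmem() and tcp_init_cgroup(). */
struct fake_hooks {
	int propagate_ret;
	int tcp_init_ret;
	int tcp_init_calls;	/* how often the tcp step was reached */
};

/* Same shape as the patched online path: kmem state first, then the
 * legacy tcp interface, returning on the first error. */
static int fake_css_online(struct fake_hooks *h)
{
	int ret;

	assert(h);
	ret = h->propagate_ret;		/* memcg_propagate_kmem() stand-in */
	if (ret)
		return ret;

	h->tcp_init_calls++;
	ret = h->tcp_init_ret;		/* tcp_init_cgroup() stand-in */
	if (ret)
		return ret;

	return 0;
}
```

With the two steps separated like this, the tcp step can later be dropped
or made conditional without touching the kmem step, which is what the
commit message says the split prepares for.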

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 6/8] mm: memcontrol: move kmem accounting code to CONFIG_MEMCG
  2015-12-08 18:34   ` Johannes Weiner
@ 2015-12-10 13:17     ` Michal Hocko
  -1 siblings, 0 replies; 79+ messages in thread
From: Michal Hocko @ 2015-12-10 13:17 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, Vladimir Davydov, linux-mm, cgroups, linux-kernel,
	kernel-team

On Tue 08-12-15 13:34:23, Johannes Weiner wrote:
> The cgroup2 memory controller will account important in-kernel memory
> consumers per default. Move all necessary components to CONFIG_MEMCG.

Hmm, that bloats the kernel also for users who are not using cgroup2
and have CONFIG_MEMCG_KMEM disabled.

This is the situation before this patch
   text    data     bss     dec     hex filename
 521342   97516   44312  663170   a1e82 mm/built-in.o.kmem
 513349   96299   43960  653608   9f928 mm/built-in.o.nokmem

and after with CONFIG_MEMCG_KMEM=n

 521028   96556   44312  661896   a1988 mm/built-in.o

We are basically back to the CONFIG_MEMCG_KMEM=y size. This sounds wasteful
to me. Do we really need this?
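
To put numbers on that comparison, the sketch below recomputes the size
deltas from the size(1) figures quoted above. The struct, variable and
function names are made up for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* One row of size(1) output; the values are copied from the tables above. */
struct obj_size {
	long text, data, bss;
};

/* Sum of the three sections, i.e. the "dec" column. */
static long total(struct obj_size s)
{
	return s.text + s.data + s.bss;
}

static const struct obj_size before_kmem   = { 521342, 97516, 44312 };
static const struct obj_size before_nokmem = { 513349, 96299, 43960 };
static const struct obj_size after_nokmem  = { 521028, 96556, 44312 };

/* Growth of a CONFIG_MEMCG_KMEM=n build caused by the patch. */
static long nokmem_growth(void)
{
	return total(after_nokmem) - total(before_nokmem);
}

/* How much smaller the patched =n build still is than the old =y build. */
static long remaining_saving(void)
{
	return total(before_kmem) - total(after_nokmem);
}

static void print_summary(void)
{
	/* Sanity check: the sections add up to the quoted "dec" value. */
	assert(total(before_kmem) == 663170);
	printf("=n build grows by %ld bytes, stays %ld bytes below old =y\n",
	       nokmem_growth(), remaining_saving());
}
```

By these figures a CONFIG_MEMCG_KMEM=n build grows by 8288 bytes with the
patch applied and ends up only 1274 bytes below the old CONFIG_MEMCG_KMEM=y
build.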

> 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
[...]
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 7/8] mm: memcontrol: account "kmem" consumers in cgroup2 memory controller
  2015-12-09 11:30     ` Vladimir Davydov
  (?)
@ 2015-12-10 13:28       ` Michal Hocko
  -1 siblings, 0 replies; 79+ messages in thread
From: Michal Hocko @ 2015-12-10 13:28 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Johannes Weiner, Andrew Morton, linux-mm, cgroups, linux-kernel,
	kernel-team

On Wed 09-12-15 14:30:38, Vladimir Davydov wrote:
> From: Vladimir Davydov <vdavydov@virtuozzo.com>
> Subject: [PATCH] mm: memcontrol: allow to disable kmem accounting for cgroup2
> 
> Kmem accounting might incur overhead that some users can't put up with.
> Besides, the implementation is still considered unstable. So let's
> provide a way to disable it for those users who aren't happy with it.

Yes, there will be users who do not want to pay the additional overhead
and can still accomplish what they need.
I haven't measured the overhead lately, especially since the opt-out ->
opt-in change, so it might be much lower than my previous ~5% for a kbuild
load.
 
> To disable kmem accounting for cgroup2, pass cgroup.memory=nokmem at
> boot time.
> 
> Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>

Acked-by: Michal Hocko <mhocko@suse.com>

Thanks!

> 
> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
> index c1bda3bbb7db..1b7a85dc6013 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -602,6 +602,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>  	cgroup.memory=	[KNL] Pass options to the cgroup memory controller.
>  			Format: <string>
>  			nosocket -- Disable socket memory accounting.
> +			nokmem -- Disable kernel memory accounting.
>  
>  	checkreqprot	[SELINUX] Set initial checkreqprot flag value.
>  			Format: { "0" | "1" }
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 6faea81e66d7..6a5572241dc6 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -83,6 +83,9 @@ struct mem_cgroup *root_mem_cgroup __read_mostly;
>  /* Socket memory accounting disabled? */
>  static bool cgroup_memory_nosocket;
>  
> +/* Kernel memory accounting disabled? */
> +static bool cgroup_memory_nokmem;
> +
>  /* Whether the swap controller is active */
>  #ifdef CONFIG_MEMCG_SWAP
>  int do_swap_account __read_mostly;
> @@ -2898,8 +2901,8 @@ static int memcg_propagate_kmem(struct mem_cgroup *memcg)
>  	 * onlined after this point, because it has at least one child
>  	 * already.
>  	 */
> -	if (cgroup_subsys_on_dfl(memory_cgrp_subsys) ||
> -	    memcg_kmem_online(parent))
> +	if (memcg_kmem_online(parent) ||
> +	    (cgroup_subsys_on_dfl(memory_cgrp_subsys) && !cgroup_memory_nokmem))
>  		ret = memcg_online_kmem(memcg);
>  	mutex_unlock(&memcg_limit_mutex);
>  	return ret;
> @@ -5587,6 +5590,8 @@ static int __init cgroup_memory(char *s)
>  			continue;
>  		if (!strcmp(token, "nosocket"))
>  			cgroup_memory_nosocket = true;
> +		if (!strcmp(token, "nokmem"))
> +			cgroup_memory_nokmem = true;
>  	}
>  	return 0;
>  }
> --
> To unsubscribe from this list: send the line "unsubscribe cgroups" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Michal Hocko
SUSE Labs
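
For readers who want to see how the nokmem option interacts with the
onlining condition in the quoted patch, here is a userspace sketch. It is
not the kernel code: struct fake_opts, parse_cgroup_memory() and
should_online_kmem() are hypothetical names, and the tokenizer is a plain
strchr() loop chosen for portability rather than whatever the kernel uses.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical container for the two boot flags from the quoted patch. */
struct fake_opts {
	bool nosocket;
	bool nokmem;
};

/* Parse a comma-separated option string such as "nosocket,nokmem".
 * The buffer is modified in place; empty tokens are skipped, as in
 * the quoted cgroup_memory() hunk. */
static struct fake_opts parse_cgroup_memory(char *s)
{
	struct fake_opts opts = { false, false };

	while (s) {
		char *token = s;
		char *comma = strchr(s, ',');

		if (comma) {
			*comma = '\0';
			s = comma + 1;
		} else {
			s = NULL;
		}
		if (!*token)
			continue;
		if (!strcmp(token, "nosocket"))
			opts.nosocket = true;
		if (!strcmp(token, "nokmem"))
			opts.nokmem = true;
	}
	return opts;
}

/* The condition from the patched memcg_propagate_kmem(): an online
 * parent always propagates; otherwise the default hierarchy enables
 * kmem accounting unless nokmem was passed. */
static bool should_online_kmem(bool parent_online, bool on_dfl, bool nokmem)
{
	return parent_online || (on_dfl && !nokmem);
}
```

Note that with this condition nokmem only suppresses the automatic
onlining on the default hierarchy; a group whose parent is already online
is still onlined.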

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 6/8] mm: memcontrol: move kmem accounting code to CONFIG_MEMCG
  2015-12-10 13:17     ` Michal Hocko
  (?)
@ 2015-12-10 14:00       ` Johannes Weiner
  -1 siblings, 0 replies; 79+ messages in thread
From: Johannes Weiner @ 2015-12-10 14:00 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, Vladimir Davydov, linux-mm, cgroups, linux-kernel,
	kernel-team

On Thu, Dec 10, 2015 at 02:17:18PM +0100, Michal Hocko wrote:
> On Tue 08-12-15 13:34:23, Johannes Weiner wrote:
> > The cgroup2 memory controller will account important in-kernel memory
> > consumers per default. Move all necessary components to CONFIG_MEMCG.
> 
> Hmm, that bloats the kernel also for users who are not using cgroup2
> and have CONFIG_MEMCG_KMEM disabled.

Huh? The clock is ticking for them. We'll keep the original cgroup
interface around for backwards compatibility as long as we think there
are active users, but this is not the time to micro-optimize v1. A
slight increase to kernel size is unfortunate, but I don't think this
could be considered a regression in the sense that it breaks anything.

I'm more concerned with cgroup2 users having to pay the excessive cost
of v1-only features, like the entire soft limit implementation, charge
migration, which also has its fangs in VM hotpaths, the eventfd stuff,
arbitrarily-configurable usage thresholds. That's a *ton* of code. If
you want to tackle kernel bloat, grouping this stuff all together and
slapping a CONFIG_MEMCG_LEGACY around it would be a real step forward.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 7/8] mm: memcontrol: account "kmem" consumers in cgroup2 memory controller
  2015-12-08 18:34   ` Johannes Weiner
@ 2015-12-10 14:21     ` Michal Hocko
  -1 siblings, 0 replies; 79+ messages in thread
From: Michal Hocko @ 2015-12-10 14:21 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, Vladimir Davydov, linux-mm, cgroups, linux-kernel,
	kernel-team

On Tue 08-12-15 13:34:24, Johannes Weiner wrote:
> The original cgroup memory controller has an extension to account slab
> memory (and other "kernel memory" consumers) in a separate "kmem"
> counter, once the user set an explicit limit on that "kmem" pool.
> 
> However, this includes various consumers whose sizes are directly
> linked to userspace activity. Accounting them as an optional "kmem"
> extension is problematic for several reasons:
> 
> 1. It leaves the main memory interface with incomplete semantics. A
>    user who puts their workload into a cgroup and configures a memory
>    limit does not expect us to leave holes in the containment as big
>    as the dentry and inode cache, or the kernel stack pages.
> 
> 2. If the limit set on this random historical subgroup of consumers is
>    reached, subsequent allocations will fail even when the main memory
>    pool available to the cgroup is not yet exhausted and/or has
>    reclaimable memory in it.
> 
> 3. Calling it 'kernel memory' is misleading. The dentry and inode
>    caches are no more 'kernel' (or no less 'user') memory than the
>    page cache itself. Treating these consumers as different classes is
>    a historical implementation detail that should not leak to users.
> 
> So, in addition to page cache, anonymous memory, and network socket
> memory, account the following memory consumers per default in the
> cgroup2 memory controller:
> 
>      - threadinfo
>      - task_struct
>      - task_delay_info
>      - pid
>      - cred
>      - mm_struct
>      - vm_area_struct and vm_region (nommu)
>      - anon_vma and anon_vma_chain
>      - signal_struct
>      - sighand_struct
>      - fs_struct
>      - files_struct
>      - fdtable and fdtable->full_fds_bits
>      - dentry and external_name
>      - inode for all filesystems.
> 
> This should give us reasonable memory isolation for most common
> workloads out of the box.
> 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/memcontrol.c | 18 +++++++++++-------
>  1 file changed, 11 insertions(+), 7 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index ab72c47..d048137 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2356,13 +2356,14 @@ int __memcg_kmem_charge_memcg(struct page *page, gfp_t gfp, int order,
>  	if (!memcg_kmem_online(memcg))
>  		return 0;
>  
> -	if (!page_counter_try_charge(&memcg->kmem, nr_pages, &counter))
> -		return -ENOMEM;
> -
>  	ret = try_charge(memcg, gfp, nr_pages);
> -	if (ret) {
> -		page_counter_uncharge(&memcg->kmem, nr_pages);
> +	if (ret)
>  		return ret;
> +
> +	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) &&
> +	    !page_counter_try_charge(&memcg->kmem, nr_pages, &counter)) {
> +		cancel_charge(memcg, nr_pages);
> +		return -ENOMEM;
>  	}
>  
>  	page->mem_cgroup = memcg;
> @@ -2391,7 +2392,9 @@ void __memcg_kmem_uncharge(struct page *page, int order)
>  
>  	VM_BUG_ON_PAGE(mem_cgroup_is_root(memcg), page);
>  
> -	page_counter_uncharge(&memcg->kmem, nr_pages);
> +	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
> +		page_counter_uncharge(&memcg->kmem, nr_pages);
> +
>  	page_counter_uncharge(&memcg->memory, nr_pages);
>  	if (do_memsw_account())
>  		page_counter_uncharge(&memcg->memsw, nr_pages);
> @@ -2895,7 +2898,8 @@ static int memcg_propagate_kmem(struct mem_cgroup *memcg)
>  	 * onlined after this point, because it has at least one child
>  	 * already.
>  	 */
> -	if (memcg_kmem_online(parent))
> +	if (cgroup_subsys_on_dfl(memory_cgrp_subsys) ||
> +	    memcg_kmem_online(parent))
>  		ret = memcg_online_kmem(memcg);
>  	mutex_unlock(&memcg_limit_mutex);
>  	return ret;
> -- 
> 2.6.3

-- 
Michal Hocko
SUSE Labs
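
The charge ordering in the quoted __memcg_kmem_charge_memcg() hunk can be sketched with a toy userspace model. Everything here is an assumption for illustration: struct counter is not the kernel's struct page_counter, the toy_* names are hypothetical, and the real try_charge() may reclaim before failing, whereas this model simply fails at the limit.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy counter: usage and limit in pages. */
struct counter {
	unsigned long usage;
	unsigned long limit;
};

static bool counter_try_charge(struct counter *c, unsigned long nr)
{
	if (c->usage + nr > c->limit)
		return false;
	c->usage += nr;
	return true;
}

static void counter_uncharge(struct counter *c, unsigned long nr)
{
	c->usage -= nr;
}

struct toy_memcg {
	struct counter memory;	/* main pool */
	struct counter kmem;	/* separate pool, used on cgroup v1 only */
};

/*
 * Order after the patch: charge the main pool first, then (v1 only)
 * the kmem pool, rolling back the main charge if kmem is over limit.
 */
static int toy_kmem_charge(struct toy_memcg *cg, unsigned long nr, bool on_dfl)
{
	if (!counter_try_charge(&cg->memory, nr))
		return -ENOMEM;

	if (!on_dfl && !counter_try_charge(&cg->kmem, nr)) {
		counter_uncharge(&cg->memory, nr);	/* stands in for cancel_charge() */
		return -ENOMEM;
	}
	return 0;
}

/* Mirror of the uncharge hunk: kmem is only touched on v1. */
static void toy_kmem_uncharge(struct toy_memcg *cg, unsigned long nr, bool on_dfl)
{
	if (!on_dfl)
		counter_uncharge(&cg->kmem, nr);
	counter_uncharge(&cg->memory, nr);
}
```

In this model a v1 group with a small kmem limit gets -ENOMEM while its main pool is nearly empty, which is problem 2 from the changelog; with on_dfl set, the same allocation is charged to the main pool alone.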

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 7/8] mm: memcontrol: account "kmem" consumers in cgroup2 memory controller
  2015-12-10 13:28       ` Michal Hocko
@ 2015-12-10 15:16         ` Johannes Weiner
  -1 siblings, 0 replies; 79+ messages in thread
From: Johannes Weiner @ 2015-12-10 15:16 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Vladimir Davydov, Andrew Morton, linux-mm, cgroups, linux-kernel,
	kernel-team

On Thu, Dec 10, 2015 at 02:28:33PM +0100, Michal Hocko wrote:
> On Wed 09-12-15 14:30:38, Vladimir Davydov wrote:
> > From: Vladimir Davydov <vdavydov@virtuozzo.com>
> > Subject: [PATCH] mm: memcontrol: allow to disable kmem accounting for cgroup2
> > 
> > Kmem accounting might incur overhead that some users can't put up with.
> > Besides, the implementation is still considered unstable. So let's
> > provide a way to disable it for those users who aren't happy with it.
> 
> Yes there will be users who do not want to pay an additional overhead
> and still accomplish what they need.
> I haven't measured the overhead lately - especially after the opt-out ->
> opt-in change so it might be much lower than my previous ~5% for kbuild
> load.

I think switching from accounting *all* slab allocations to accounting
a list of, what, less than 20 select slabs, counts as a change
significant enough to entirely invalidate those measurements and never
bring up that number again in the context of kmem cost, don't you think?

There isn't that much that kmem accounting is doing, but for posterity I ran
a kbuild test inside a cgroup2, with and without cgroup.memory=nokmem,
and these are the results:

default:
 Performance counter stats for 'make -j16 -s clean bzImage' (3 runs):

     715823.047005      task-clock (msec)         #    3.794 CPUs utilized          
           252,538      context-switches          #    0.353 K/sec                  
            32,018      cpu-migrations            #    0.045 K/sec                  
        16,678,202      page-faults               #    0.023 M/sec                  
 1,783,804,914,980      cycles                    #    2.492 GHz                    
   <not supported>      stalled-cycles-frontend  
   <not supported>      stalled-cycles-backend   
 1,346,424,021,728      instructions              #    0.75  insns per cycle        
   298,744,956,474      branches                  #  417.363 M/sec                  
    10,207,872,737      branch-misses             #    3.42% of all branches        

     188.667608149 seconds time elapsed                                          ( +-  0.66% )

cgroup.memory=nokmem
 Performance counter stats for 'make -j16 -s clean bzImage' (3 runs):

     729028.322760      task-clock (msec)         #    3.805 CPUs utilized          
           258,775      context-switches          #    0.356 K/sec                  
            32,241      cpu-migrations            #    0.044 K/sec                  
        16,647,817      page-faults               #    0.023 M/sec                  
 1,816,827,061,194      cycles                    #    2.497 GHz                    
   <not supported>      stalled-cycles-frontend  
   <not supported>      stalled-cycles-backend   
 1,345,446,962,095      instructions              #    0.74  insns per cycle        
   298,461,034,326      branches                  #  410.277 M/sec                  
    10,215,145,963      branch-misses             #    3.42% of all branches        

     191.583957742 seconds time elapsed                                          ( +-  0.57% )

I would say the difference is solidly in the noise.
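
As a worked check of the two tables, the relative differences can be computed directly from the reported figures. This is only arithmetic on the numbers above; the helper name rel_diff_pct() and the constant names are hypothetical.

```c
/* Figures copied from the two perf stat tables above. */
static const double elapsed_default = 188.667608149;	/* seconds, kmem accounting on */
static const double elapsed_nokmem  = 191.583957742;	/* seconds, cgroup.memory=nokmem */
static const double insns_default   = 1346424021728.0;	/* instructions, kmem accounting on */
static const double insns_nokmem    = 1345446962095.0;	/* instructions, cgroup.memory=nokmem */

/* Relative difference of b against baseline a, in percent. */
static double rel_diff_pct(double a, double b)
{
	return (b - a) / a * 100.0;
}
```

rel_diff_pct(elapsed_default, elapsed_nokmem) comes out at about 1.5%, meaning the nokmem run was the slower one, against a reported run-to-run variation of 0.57% to 0.66%. The instruction counts differ by less than 0.1%, with the nokmem run executing slightly fewer.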

I also profiled a silly find | xargs stat pipe to exercise the dentry
and inode accounting, and this was the highest kmem-specific entry:

     0.27%     0.27%  find             [kernel.kallsyms]        [k] __memcg_kmem_get_cache                    
                       |
                       ---__memcg_kmem_get_cache
                          __kmalloc
                          ext4_htree_store_dirent
                          htree_dirblock_to_tree
                          ext4_htree_fill_tree
                          ext4_readdir
                          iterate_dir
                          sys_getdents
                          entry_SYSCALL_64_fastpath
                          __getdents64

So can we *please* lay this whole "unreasonable burden to legacy and
power users" line of argument to rest and get on with our work? And
then tackle scalability problems as they show up in real workloads?

Thanks.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 7/8] mm: memcontrol: account "kmem" consumers in cgroup2 memory controller
  2015-12-10 15:16         ` Johannes Weiner
  (?)
@ 2015-12-10 16:25           ` Michal Hocko
  -1 siblings, 0 replies; 79+ messages in thread
From: Michal Hocko @ 2015-12-10 16:25 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Vladimir Davydov, Andrew Morton, linux-mm, cgroups, linux-kernel,
	kernel-team

On Thu 10-12-15 10:16:27, Johannes Weiner wrote:
> On Thu, Dec 10, 2015 at 02:28:33PM +0100, Michal Hocko wrote:
> > On Wed 09-12-15 14:30:38, Vladimir Davydov wrote:
> > > From: Vladimir Davydov <vdavydov@virtuozzo.com>
> > > Subject: [PATCH] mm: memcontrol: allow to disable kmem accounting for cgroup2
> > > 
> > > Kmem accounting might incur overhead that some users can't put up with.
> > > Besides, the implementation is still considered unstable. So let's
> > > provide a way to disable it for those users who aren't happy with it.
> > 
> > Yes there will be users who do not want to pay an additional overhead
> > and still accomplish what they need.
> > I haven't measured the overhead lately - especially after the opt-out ->
> > opt-in change so it might be much lower than my previous ~5% for kbuild
> > load.
> 
> I think switching from accounting *all* slab allocations to accounting
> a list of, what, less than 20 select slabs, counts as a change
> significant enough to entirely invalidate those measurements and never
> bring up that number again in the context of kmem cost, don't you think?

Yes, as I've said the numbers are expected to be much lower. That is
one of the reasons I have acknowledged kmem enabled as a reasonable
default.  There will always be _special_ loads where numbers might look
different, though, and having a disabling knob is a reasonable thing
to offer with a minimum maintenance overhead. And this is the argument
for the inclusion of the patch from Vladimir.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 7/8] mm: memcontrol: account "kmem" consumers in cgroup2 memory controller
@ 2015-12-10 16:25           ` Michal Hocko
  0 siblings, 0 replies; 79+ messages in thread
From: Michal Hocko @ 2015-12-10 16:25 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Vladimir Davydov, Andrew Morton, linux-mm, cgroups, linux-kernel,
	kernel-team

On Thu 10-12-15 10:16:27, Johannes Weiner wrote:
> On Thu, Dec 10, 2015 at 02:28:33PM +0100, Michal Hocko wrote:
> > On Wed 09-12-15 14:30:38, Vladimir Davydov wrote:
> > > From: Vladimir Davydov <vdavydov@virtuozzo.com>
> > > Subject: [PATCH] mm: memcontrol: allow to disable kmem accounting for cgroup2
> > > 
> > > Kmem accounting might incur overhead that some users can't put up with.
> > > Besides, the implementation is still considered unstable. So let's
> > > provide a way to disable it for those users who aren't happy with it.
> > 
> > Yes there will be users who do not want to pay an additional overhead
> > and still accoplish what they need.
> > I haven't measured the overhead lately - especially after the opt-out ->
> > opt-in change so it might be much lower than my previous ~5% for kbuild
> > load.
> 
> I think switching from accounting *all* slab allocations to accounting
> a list of, what, less than 20 select slabs, counts as a change
> significant enough to entirely invalidate those measurements and never
> bring up that number again in the context of kmem cost, don't you think?

Yes, as I've said the numbers are expected to be much lower. That is
one of the reasons I have acknowledged kmem enabled as a reasonable
default.  There will always be _special_ loads where numbers might look
differently, though, and having a disabling knob is a reasonable thing
to offer with a minimum maintenance overhead. And this is the argument
for the inclusion of the patch from Vladimir.

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 7/8] mm: memcontrol: account "kmem" consumers in cgroup2 memory controller
@ 2015-12-10 16:25           ` Michal Hocko
  0 siblings, 0 replies; 79+ messages in thread
From: Michal Hocko @ 2015-12-10 16:25 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Vladimir Davydov, Andrew Morton, linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kernel-team-b10kYP2dOMg

On Thu 10-12-15 10:16:27, Johannes Weiner wrote:
> On Thu, Dec 10, 2015 at 02:28:33PM +0100, Michal Hocko wrote:
> > On Wed 09-12-15 14:30:38, Vladimir Davydov wrote:
> > > From: Vladimir Davydov <vdavydov-5HdwGun5lf+gSpxsJD1C4w@public.gmane.org>
> > > Subject: [PATCH] mm: memcontrol: allow to disable kmem accounting for cgroup2
> > > 
> > > Kmem accounting might incur overhead that some users can't put up with.
> > > Besides, the implementation is still considered unstable. So let's
> > > provide a way to disable it for those users who aren't happy with it.
> > 
> > Yes there will be users who do not want to pay an additional overhead
> > and still accoplish what they need.
> > I haven't measured the overhead lately - especially after the opt-out ->
> > opt-in change so it might be much lower than my previous ~5% for kbuild
> > load.
> 
> I think switching from accounting *all* slab allocations to accounting
> a list of, what, less than 20 select slabs, counts as a change
> significant enough to entirely invalidate those measurements and never
> bring up that number again in the context of kmem cost, don't you think?

Yes, as I've said the numbers are expected to be much lower. That is
one of the reasons I have acknowledged kmem enabled as a reasonable
default.  There will always be _special_ loads where numbers might look
different, though, and having a disabling knob is a reasonable thing
to offer with a minimum maintenance overhead. And this is the argument
for the inclusion of the patch from Vladimir.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 79+ messages in thread

* [PATCH 6/8 v2] mm: memcontrol: move kmem accounting code to CONFIG_MEMCG
  2015-12-08 18:34   ` Johannes Weiner
  (?)
@ 2015-12-10 20:22     ` Johannes Weiner
  -1 siblings, 0 replies; 79+ messages in thread
From: Johannes Weiner @ 2015-12-10 20:22 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Michal Hocko, Vladimir Davydov, linux-mm, cgroups, linux-kernel,
	kernel-team, Arnd Bergmann

The cgroup2 memory controller will account important in-kernel memory
consumers per default. Move all necessary components to CONFIG_MEMCG.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---

Hi Andrew,

here is a drop-in replacement for what's in your tree to make slob
work with memcg again. It guards all the kmem-specific stuff with
CONFIG_MEMCG && !CONFIG_SLOB. A little lame but I figure it's not
worth introducing another level of indirection and going through the
trouble of finding a more meaningful symbol (CONFIG_MEMCG_KMEM is
taken, CONFIG_MEMCG_SLAB is ambiguous with SLAB vs. SLUB etc.).

So there.

The rest of my series applies fine on top, but if this creates fallout
in subsequent patches you have in your tree I'm happy to send refreshes.

Thanks again for your report, Arnd!
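
For readers following along, the guard pattern described in the note
above can be sketched outside the kernel as follows. This is an
illustrative sketch only, not code from the patch: the CONFIG_* macros
are defined by hand here (in a real build they come from Kconfig), and
mem_cgroup_sketch, propagate_kmem_sketch() and css_online_sketch() are
hypothetical names. It shows why the call to memcg_propagate_kmem() no
longer needs an #ifdef around it: when the guard is false, an empty
stub takes the function's place.

```c
#include <assert.h>
#include <stddef.h>

/* Hand-defined stand-ins for Kconfig symbols (assumption for this sketch). */
#define CONFIG_MEMCG 1
/* #define CONFIG_SLOB 1 */	/* uncomment to select the stub below */

/* Hypothetical, heavily reduced stand-in for struct mem_cgroup. */
struct mem_cgroup_sketch {
	int kmemcg_id;		/* -1 means "no kmem cache index assigned" */
};

#if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
/* Real variant: pretend to set up kmem accounting by assigning an index. */
static int propagate_kmem_sketch(struct mem_cgroup_sketch *memcg)
{
	memcg->kmemcg_id = 0;
	return 0;
}
#else
/* Stub variant: nothing to account, report success and touch nothing. */
static int propagate_kmem_sketch(struct mem_cgroup_sketch *memcg)
{
	(void)memcg;
	return 0;
}
#endif /* CONFIG_MEMCG && !CONFIG_SLOB */

/* The caller is free of #ifdefs; the guard above decides what runs. */
int css_online_sketch(struct mem_cgroup_sketch *memcg)
{
	assert(memcg != NULL);
	return propagate_kmem_sketch(memcg);
}
```

Defining CONFIG_SLOB (or dropping CONFIG_MEMCG) selects the stub, so
css_online_sketch() still compiles and kmemcg_id keeps whatever value
the caller initialized it to.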

 include/linux/list_lru.h   |  4 +--
 include/linux/memcontrol.h |  7 +++--
 include/linux/sched.h      |  4 +--
 include/linux/slab_def.h   |  3 +-
 include/linux/slub_def.h   |  2 +-
 mm/list_lru.c              | 12 ++++----
 mm/memcontrol.c            | 69 +++++++++++++++++++++++++++-------------------
 mm/slab.h                  |  6 ++--
 mm/slab_common.c           | 10 +++----
 mm/slub.c                  | 10 +++----
 10 files changed, 71 insertions(+), 56 deletions(-)

diff --git a/include/linux/list_lru.h b/include/linux/list_lru.h
index 2a6b994..cb0ba9f 100644
--- a/include/linux/list_lru.h
+++ b/include/linux/list_lru.h
@@ -40,7 +40,7 @@ struct list_lru_node {
 	spinlock_t		lock;
 	/* global list, used for the root cgroup in cgroup aware lrus */
 	struct list_lru_one	lru;
-#ifdef CONFIG_MEMCG_KMEM
+#if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
 	/* for cgroup aware lrus points to per cgroup lists, otherwise NULL */
 	struct list_lru_memcg	*memcg_lrus;
 #endif
@@ -48,7 +48,7 @@ struct list_lru_node {
 
 struct list_lru {
 	struct list_lru_node	*node;
-#ifdef CONFIG_MEMCG_KMEM
+#if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
 	struct list_head	list;
 #endif
 };
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 54dab4d..a87704e 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -236,7 +236,7 @@ struct mem_cgroup {
 #if defined(CONFIG_MEMCG_KMEM) && defined(CONFIG_INET)
 	struct cg_proto tcp_mem;
 #endif
-#if defined(CONFIG_MEMCG_KMEM)
+#ifndef CONFIG_SLOB
         /* Index in the kmem_cache->memcg_params.memcg_caches array */
 	int kmemcg_id;
 	enum memcg_kmem_state kmem_state;
@@ -735,7 +735,7 @@ static inline bool mem_cgroup_under_socket_pressure(struct mem_cgroup *memcg)
 }
 #endif
 
-#ifdef CONFIG_MEMCG_KMEM
+#if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
 extern struct static_key_false memcg_kmem_enabled_key;
 
 extern int memcg_nr_cache_ids;
@@ -891,5 +891,6 @@ memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp)
 static inline void memcg_kmem_put_cache(struct kmem_cache *cachep)
 {
 }
-#endif /* CONFIG_MEMCG_KMEM */
+#endif /* CONFIG_MEMCG && !CONFIG_SLOB */
+
 #endif /* _LINUX_MEMCONTROL_H */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index edad7a4..ccf957e 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1465,10 +1465,10 @@ struct task_struct {
 	unsigned sched_migrated:1;
 #ifdef CONFIG_MEMCG
 	unsigned memcg_may_oom:1;
-#endif
-#ifdef CONFIG_MEMCG_KMEM
+#ifndef CONFIG_SLOB
 	unsigned memcg_kmem_skip_account:1;
 #endif
+#endif
 #ifdef CONFIG_COMPAT_BRK
 	unsigned brk_randomized:1;
 #endif
diff --git a/include/linux/slab_def.h b/include/linux/slab_def.h
index 33d0490..cf139d3 100644
--- a/include/linux/slab_def.h
+++ b/include/linux/slab_def.h
@@ -69,7 +69,8 @@ struct kmem_cache {
 	 */
 	int obj_offset;
 #endif /* CONFIG_DEBUG_SLAB */
-#ifdef CONFIG_MEMCG_KMEM
+
+#ifdef CONFIG_MEMCG
 	struct memcg_cache_params memcg_params;
 #endif
 
diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index 3388511..b7e57927 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -84,7 +84,7 @@ struct kmem_cache {
 #ifdef CONFIG_SYSFS
 	struct kobject kobj;	/* For sysfs */
 #endif
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG
 	struct memcg_cache_params memcg_params;
 	int max_attr_size; /* for propagation, maximum size of a stored attr */
 #ifdef CONFIG_SYSFS
diff --git a/mm/list_lru.c b/mm/list_lru.c
index afc71ea..1d05cb9 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -12,7 +12,7 @@
 #include <linux/mutex.h>
 #include <linux/memcontrol.h>
 
-#ifdef CONFIG_MEMCG_KMEM
+#if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
 static LIST_HEAD(list_lrus);
 static DEFINE_MUTEX(list_lrus_mutex);
 
@@ -37,9 +37,9 @@ static void list_lru_register(struct list_lru *lru)
 static void list_lru_unregister(struct list_lru *lru)
 {
 }
-#endif /* CONFIG_MEMCG_KMEM */
+#endif /* CONFIG_MEMCG && !CONFIG_SLOB */
 
-#ifdef CONFIG_MEMCG_KMEM
+#if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
 static inline bool list_lru_memcg_aware(struct list_lru *lru)
 {
 	/*
@@ -104,7 +104,7 @@ list_lru_from_kmem(struct list_lru_node *nlru, void *ptr)
 {
 	return &nlru->lru;
 }
-#endif /* CONFIG_MEMCG_KMEM */
+#endif /* CONFIG_MEMCG && !CONFIG_SLOB */
 
 bool list_lru_add(struct list_lru *lru, struct list_head *item)
 {
@@ -292,7 +292,7 @@ static void init_one_lru(struct list_lru_one *l)
 	l->nr_items = 0;
 }
 
-#ifdef CONFIG_MEMCG_KMEM
+#if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
 static void __memcg_destroy_list_lru_node(struct list_lru_memcg *memcg_lrus,
 					  int begin, int end)
 {
@@ -529,7 +529,7 @@ static int memcg_init_list_lru(struct list_lru *lru, bool memcg_aware)
 static void memcg_destroy_list_lru(struct list_lru *lru)
 {
 }
-#endif /* CONFIG_MEMCG_KMEM */
+#endif /* CONFIG_MEMCG && !CONFIG_SLOB */
 
 int __list_lru_init(struct list_lru *lru, bool memcg_aware,
 		    struct lock_class_key *key)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 55a3f07..70e6fd1 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -297,7 +297,7 @@ static inline struct mem_cgroup *mem_cgroup_from_id(unsigned short id)
 	return mem_cgroup_from_css(css);
 }
 
-#ifdef CONFIG_MEMCG_KMEM
+#ifndef CONFIG_SLOB
 /*
  * This will be the memcg's index in each cache's ->memcg_params.memcg_caches.
  * The main reason for not using cgroup id for this:
@@ -349,7 +349,7 @@ void memcg_put_cache_ids(void)
 DEFINE_STATIC_KEY_FALSE(memcg_kmem_enabled_key);
 EXPORT_SYMBOL(memcg_kmem_enabled_key);
 
-#endif /* CONFIG_MEMCG_KMEM */
+#endif /* !CONFIG_SLOB */
 
 static struct mem_cgroup_per_zone *
 mem_cgroup_zone_zoneinfo(struct mem_cgroup *memcg, struct zone *zone)
@@ -2182,7 +2182,7 @@ static void commit_charge(struct page *page, struct mem_cgroup *memcg,
 		unlock_page_lru(page, isolated);
 }
 
-#ifdef CONFIG_MEMCG_KMEM
+#ifndef CONFIG_SLOB
 static int memcg_alloc_cache_id(void)
 {
 	int id, size;
@@ -2403,7 +2403,7 @@ void __memcg_kmem_uncharge(struct page *page, int order)
 	page->mem_cgroup = NULL;
 	css_put_many(&memcg->css, nr_pages);
 }
-#endif /* CONFIG_MEMCG_KMEM */
+#endif /* !CONFIG_SLOB */
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 
@@ -2839,7 +2839,7 @@ static u64 mem_cgroup_read_u64(struct cgroup_subsys_state *css,
 	}
 }
 
-#ifdef CONFIG_MEMCG_KMEM
+#ifndef CONFIG_SLOB
 static int memcg_online_kmem(struct mem_cgroup *memcg)
 {
 	int err = 0;
@@ -2887,24 +2887,6 @@ out:
 	return err;
 }
 
-static int memcg_update_kmem_limit(struct mem_cgroup *memcg,
-				   unsigned long limit)
-{
-	int ret;
-
-	mutex_lock(&memcg_limit_mutex);
-	/* Top-level cgroup doesn't propagate from root */
-	if (!memcg_kmem_online(memcg)) {
-		ret = memcg_online_kmem(memcg);
-		if (ret)
-			goto out;
-	}
-	ret = page_counter_limit(&memcg->kmem, limit);
-out:
-	mutex_unlock(&memcg_limit_mutex);
-	return ret;
-}
-
 static int memcg_propagate_kmem(struct mem_cgroup *memcg)
 {
 	int ret = 0;
@@ -2979,16 +2961,45 @@ static void memcg_free_kmem(struct mem_cgroup *memcg)
 	}
 }
 #else
+static int memcg_propagate_kmem(struct mem_cgroup *memcg)
+{
+	return 0;
+}
+static void memcg_offline_kmem(struct mem_cgroup *memcg)
+{
+}
+static void memcg_free_kmem(struct mem_cgroup *memcg)
+{
+}
+#endif /* !CONFIG_SLOB */
+
+#ifdef CONFIG_MEMCG_KMEM
 static int memcg_update_kmem_limit(struct mem_cgroup *memcg,
 				   unsigned long limit)
 {
-	return -EINVAL;
+	int ret;
+
+	mutex_lock(&memcg_limit_mutex);
+	/* Top-level cgroup doesn't propagate from root */
+	if (!memcg_kmem_online(memcg)) {
+		ret = memcg_online_kmem(memcg);
+		if (ret)
+			goto out;
+	}
+	ret = page_counter_limit(&memcg->kmem, limit);
+out:
+	mutex_unlock(&memcg_limit_mutex);
+	return ret;
 }
-static void memcg_offline_kmem(struct mem_cgroup *memcg)
+#else
+static int memcg_update_kmem_limit(struct mem_cgroup *memcg,
+				   unsigned long limit)
 {
+	return -EINVAL;
 }
 #endif /* CONFIG_MEMCG_KMEM */
 
+
 /*
  * The user of this function is...
  * RES_LIMIT.
@@ -4160,7 +4171,7 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 	vmpressure_init(&memcg->vmpressure);
 	INIT_LIST_HEAD(&memcg->event_list);
 	spin_lock_init(&memcg->event_list_lock);
-#ifdef CONFIG_MEMCG_KMEM
+#ifndef CONFIG_SLOB
 	memcg->kmemcg_id = -1;
 #endif
 #ifdef CONFIG_CGROUP_WRITEBACK
@@ -4222,10 +4233,11 @@ mem_cgroup_css_online(struct cgroup_subsys_state *css)
 	}
 	mutex_unlock(&memcg_create_mutex);
 
-#ifdef CONFIG_MEMCG_KMEM
 	ret = memcg_propagate_kmem(memcg);
 	if (ret)
 		return ret;
+
+#ifdef CONFIG_MEMCG_KMEM
 	ret = tcp_init_cgroup(memcg);
 	if (ret)
 		return ret;
@@ -4279,8 +4291,9 @@ static void mem_cgroup_css_free(struct cgroup_subsys_state *css)
 		static_branch_dec(&memcg_sockets_enabled_key);
 #endif
 
-#ifdef CONFIG_MEMCG_KMEM
 	memcg_free_kmem(memcg);
+
+#ifdef CONFIG_MEMCG_KMEM
 	tcp_destroy_cgroup(memcg);
 #endif
 
diff --git a/mm/slab.h b/mm/slab.h
index c63b869..834ad24 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -173,7 +173,7 @@ ssize_t slabinfo_write(struct file *file, const char __user *buffer,
 void __kmem_cache_free_bulk(struct kmem_cache *, size_t, void **);
 int __kmem_cache_alloc_bulk(struct kmem_cache *, gfp_t, size_t, void **);
 
-#ifdef CONFIG_MEMCG_KMEM
+#if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
 /*
  * Iterate over all memcg caches of the given root cache. The caller must hold
  * slab_mutex.
@@ -251,7 +251,7 @@ static __always_inline int memcg_charge_slab(struct page *page,
 
 extern void slab_init_memcg_params(struct kmem_cache *);
 
-#else /* !CONFIG_MEMCG_KMEM */
+#else /* CONFIG_MEMCG && !CONFIG_SLOB */
 
 #define for_each_memcg_cache(iter, root) \
 	for ((void)(iter), (void)(root); 0; )
@@ -292,7 +292,7 @@ static inline int memcg_charge_slab(struct page *page, gfp_t gfp, int order,
 static inline void slab_init_memcg_params(struct kmem_cache *s)
 {
 }
-#endif /* CONFIG_MEMCG_KMEM */
+#endif /* CONFIG_MEMCG && !CONFIG_SLOB */
 
 static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x)
 {
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 8c262e6..b50aef0 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -128,7 +128,7 @@ int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t nr,
 	return i;
 }
 
-#ifdef CONFIG_MEMCG_KMEM
+#if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
 void slab_init_memcg_params(struct kmem_cache *s)
 {
 	s->memcg_params.is_root_cache = true;
@@ -221,7 +221,7 @@ static inline int init_memcg_params(struct kmem_cache *s,
 static inline void destroy_memcg_params(struct kmem_cache *s)
 {
 }
-#endif /* CONFIG_MEMCG_KMEM */
+#endif /* CONFIG_MEMCG && !CONFIG_SLOB */
 
 /*
  * Find a mergeable slab cache
@@ -477,7 +477,7 @@ static void release_caches(struct list_head *release, bool need_rcu_barrier)
 	}
 }
 
-#ifdef CONFIG_MEMCG_KMEM
+#if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
 /*
  * memcg_create_kmem_cache - Create a cache for a memory cgroup.
  * @memcg: The memory cgroup the new cache is for.
@@ -689,7 +689,7 @@ static inline int shutdown_memcg_caches(struct kmem_cache *s,
 {
 	return 0;
 }
-#endif /* CONFIG_MEMCG_KMEM */
+#endif /* CONFIG_MEMCG && !CONFIG_SLOB */
 
 void slab_kmem_cache_release(struct kmem_cache *s)
 {
@@ -1123,7 +1123,7 @@ static int slab_show(struct seq_file *m, void *p)
 	return 0;
 }
 
-#ifdef CONFIG_MEMCG_KMEM
+#if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
 int memcg_slab_show(struct seq_file *m, void *p)
 {
 	struct kmem_cache *s = list_entry(p, struct kmem_cache, list);
diff --git a/mm/slub.c b/mm/slub.c
index b21fd24..2e1355a 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -5207,7 +5207,7 @@ static ssize_t slab_attr_store(struct kobject *kobj,
 		return -EIO;
 
 	err = attribute->store(s, buf, len);
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG
 	if (slab_state >= FULL && err >= 0 && is_root_cache(s)) {
 		struct kmem_cache *c;
 
@@ -5242,7 +5242,7 @@ static ssize_t slab_attr_store(struct kobject *kobj,
 
 static void memcg_propagate_slab_attrs(struct kmem_cache *s)
 {
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG
 	int i;
 	char *buffer = NULL;
 	struct kmem_cache *root_cache;
@@ -5328,7 +5328,7 @@ static struct kset *slab_kset;
 
 static inline struct kset *cache_kset(struct kmem_cache *s)
 {
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG
 	if (!is_root_cache(s))
 		return s->memcg_params.root_cache->memcg_kset;
 #endif
@@ -5405,7 +5405,7 @@ static int sysfs_slab_add(struct kmem_cache *s)
 	if (err)
 		goto out_del_kobj;
 
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG
 	if (is_root_cache(s)) {
 		s->memcg_kset = kset_create_and_add("cgroup", NULL, &s->kobj);
 		if (!s->memcg_kset) {
@@ -5438,7 +5438,7 @@ void sysfs_slab_remove(struct kmem_cache *s)
 		 */
 		return;
 
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG
 	kset_unregister(s->memcg_kset);
 #endif
 	kobject_uevent(&s->kobj, KOBJ_REMOVE);
-- 
2.6.3

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 6/8 v2] mm: memcontrol: move kmem accounting code to CONFIG_MEMCG
@ 2015-12-10 20:22     ` Johannes Weiner
  0 siblings, 0 replies; 79+ messages in thread
From: Johannes Weiner @ 2015-12-10 20:22 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Michal Hocko, Vladimir Davydov, linux-mm, cgroups, linux-kernel,
	kernel-team, Arnd Bergmann

The cgroup2 memory controller will account important in-kernel memory
consumers per default. Move all necessary components to CONFIG_MEMCG.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---

Hi Andrew,

here is a drop-in replacement for what's in your tree to make slob
work with memcg again. It guards all the kmem-specific stuff with
CONFIG_MEMCG && !CONFIG_SLOB. A little lame but I figure it's not
worth introducing another level of indirection and going through the
trouble of finding a more meaningful symbol (CONFIG_MEMCG_KMEM is
taken, CONFIG_MEMCG_SLAB is ambiguous with SLAB vs. SLUB etc.).

So there.

The rest of my series applies fine on top, but if this creates fallout
in subsequent patches you have in your tree I'm happy to send refreshes.

Thanks again for your report, Arnd!

 include/linux/list_lru.h   |  4 +--
 include/linux/memcontrol.h |  7 +++--
 include/linux/sched.h      |  4 +--
 include/linux/slab_def.h   |  3 +-
 include/linux/slub_def.h   |  2 +-
 mm/list_lru.c              | 12 ++++----
 mm/memcontrol.c            | 69 +++++++++++++++++++++++++++-------------------
 mm/slab.h                  |  6 ++--
 mm/slab_common.c           | 10 +++----
 mm/slub.c                  | 10 +++----
 10 files changed, 71 insertions(+), 56 deletions(-)

diff --git a/include/linux/list_lru.h b/include/linux/list_lru.h
index 2a6b994..cb0ba9f 100644
--- a/include/linux/list_lru.h
+++ b/include/linux/list_lru.h
@@ -40,7 +40,7 @@ struct list_lru_node {
 	spinlock_t		lock;
 	/* global list, used for the root cgroup in cgroup aware lrus */
 	struct list_lru_one	lru;
-#ifdef CONFIG_MEMCG_KMEM
+#if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
 	/* for cgroup aware lrus points to per cgroup lists, otherwise NULL */
 	struct list_lru_memcg	*memcg_lrus;
 #endif
@@ -48,7 +48,7 @@ struct list_lru_node {
 
 struct list_lru {
 	struct list_lru_node	*node;
-#ifdef CONFIG_MEMCG_KMEM
+#if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
 	struct list_head	list;
 #endif
 };
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 54dab4d..a87704e 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -236,7 +236,7 @@ struct mem_cgroup {
 #if defined(CONFIG_MEMCG_KMEM) && defined(CONFIG_INET)
 	struct cg_proto tcp_mem;
 #endif
-#if defined(CONFIG_MEMCG_KMEM)
+#ifndef CONFIG_SLOB
         /* Index in the kmem_cache->memcg_params.memcg_caches array */
 	int kmemcg_id;
 	enum memcg_kmem_state kmem_state;
@@ -735,7 +735,7 @@ static inline bool mem_cgroup_under_socket_pressure(struct mem_cgroup *memcg)
 }
 #endif
 
-#ifdef CONFIG_MEMCG_KMEM
+#if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
 extern struct static_key_false memcg_kmem_enabled_key;
 
 extern int memcg_nr_cache_ids;
@@ -891,5 +891,6 @@ memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp)
 static inline void memcg_kmem_put_cache(struct kmem_cache *cachep)
 {
 }
-#endif /* CONFIG_MEMCG_KMEM */
+#endif /* CONFIG_MEMCG && !CONFIG_SLOB */
+
 #endif /* _LINUX_MEMCONTROL_H */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index edad7a4..ccf957e 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1465,10 +1465,10 @@ struct task_struct {
 	unsigned sched_migrated:1;
 #ifdef CONFIG_MEMCG
 	unsigned memcg_may_oom:1;
-#endif
-#ifdef CONFIG_MEMCG_KMEM
+#ifndef CONFIG_SLOB
 	unsigned memcg_kmem_skip_account:1;
 #endif
+#endif
 #ifdef CONFIG_COMPAT_BRK
 	unsigned brk_randomized:1;
 #endif
diff --git a/include/linux/slab_def.h b/include/linux/slab_def.h
index 33d0490..cf139d3 100644
--- a/include/linux/slab_def.h
+++ b/include/linux/slab_def.h
@@ -69,7 +69,8 @@ struct kmem_cache {
 	 */
 	int obj_offset;
 #endif /* CONFIG_DEBUG_SLAB */
-#ifdef CONFIG_MEMCG_KMEM
+
+#ifdef CONFIG_MEMCG
 	struct memcg_cache_params memcg_params;
 #endif
 
diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index 3388511..b7e57927 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -84,7 +84,7 @@ struct kmem_cache {
 #ifdef CONFIG_SYSFS
 	struct kobject kobj;	/* For sysfs */
 #endif
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG
 	struct memcg_cache_params memcg_params;
 	int max_attr_size; /* for propagation, maximum size of a stored attr */
 #ifdef CONFIG_SYSFS
diff --git a/mm/list_lru.c b/mm/list_lru.c
index afc71ea..1d05cb9 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -12,7 +12,7 @@
 #include <linux/mutex.h>
 #include <linux/memcontrol.h>
 
-#ifdef CONFIG_MEMCG_KMEM
+#if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
 static LIST_HEAD(list_lrus);
 static DEFINE_MUTEX(list_lrus_mutex);
 
@@ -37,9 +37,9 @@ static void list_lru_register(struct list_lru *lru)
 static void list_lru_unregister(struct list_lru *lru)
 {
 }
-#endif /* CONFIG_MEMCG_KMEM */
+#endif /* CONFIG_MEMCG && !CONFIG_SLOB */
 
-#ifdef CONFIG_MEMCG_KMEM
+#if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
 static inline bool list_lru_memcg_aware(struct list_lru *lru)
 {
 	/*
@@ -104,7 +104,7 @@ list_lru_from_kmem(struct list_lru_node *nlru, void *ptr)
 {
 	return &nlru->lru;
 }
-#endif /* CONFIG_MEMCG_KMEM */
+#endif /* CONFIG_MEMCG && !CONFIG_SLOB */
 
 bool list_lru_add(struct list_lru *lru, struct list_head *item)
 {
@@ -292,7 +292,7 @@ static void init_one_lru(struct list_lru_one *l)
 	l->nr_items = 0;
 }
 
-#ifdef CONFIG_MEMCG_KMEM
+#if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
 static void __memcg_destroy_list_lru_node(struct list_lru_memcg *memcg_lrus,
 					  int begin, int end)
 {
@@ -529,7 +529,7 @@ static int memcg_init_list_lru(struct list_lru *lru, bool memcg_aware)
 static void memcg_destroy_list_lru(struct list_lru *lru)
 {
 }
-#endif /* CONFIG_MEMCG_KMEM */
+#endif /* CONFIG_MEMCG && !CONFIG_SLOB */
 
 int __list_lru_init(struct list_lru *lru, bool memcg_aware,
 		    struct lock_class_key *key)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 55a3f07..70e6fd1 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -297,7 +297,7 @@ static inline struct mem_cgroup *mem_cgroup_from_id(unsigned short id)
 	return mem_cgroup_from_css(css);
 }
 
-#ifdef CONFIG_MEMCG_KMEM
+#ifndef CONFIG_SLOB
 /*
  * This will be the memcg's index in each cache's ->memcg_params.memcg_caches.
  * The main reason for not using cgroup id for this:
@@ -349,7 +349,7 @@ void memcg_put_cache_ids(void)
 DEFINE_STATIC_KEY_FALSE(memcg_kmem_enabled_key);
 EXPORT_SYMBOL(memcg_kmem_enabled_key);
 
-#endif /* CONFIG_MEMCG_KMEM */
+#endif /* !CONFIG_SLOB */
 
 static struct mem_cgroup_per_zone *
 mem_cgroup_zone_zoneinfo(struct mem_cgroup *memcg, struct zone *zone)
@@ -2182,7 +2182,7 @@ static void commit_charge(struct page *page, struct mem_cgroup *memcg,
 		unlock_page_lru(page, isolated);
 }
 
-#ifdef CONFIG_MEMCG_KMEM
+#ifndef CONFIG_SLOB
 static int memcg_alloc_cache_id(void)
 {
 	int id, size;
@@ -2403,7 +2403,7 @@ void __memcg_kmem_uncharge(struct page *page, int order)
 	page->mem_cgroup = NULL;
 	css_put_many(&memcg->css, nr_pages);
 }
-#endif /* CONFIG_MEMCG_KMEM */
+#endif /* !CONFIG_SLOB */
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 
@@ -2839,7 +2839,7 @@ static u64 mem_cgroup_read_u64(struct cgroup_subsys_state *css,
 	}
 }
 
-#ifdef CONFIG_MEMCG_KMEM
+#ifndef CONFIG_SLOB
 static int memcg_online_kmem(struct mem_cgroup *memcg)
 {
 	int err = 0;
@@ -2887,24 +2887,6 @@ out:
 	return err;
 }
 
-static int memcg_update_kmem_limit(struct mem_cgroup *memcg,
-				   unsigned long limit)
-{
-	int ret;
-
-	mutex_lock(&memcg_limit_mutex);
-	/* Top-level cgroup doesn't propagate from root */
-	if (!memcg_kmem_online(memcg)) {
-		ret = memcg_online_kmem(memcg);
-		if (ret)
-			goto out;
-	}
-	ret = page_counter_limit(&memcg->kmem, limit);
-out:
-	mutex_unlock(&memcg_limit_mutex);
-	return ret;
-}
-
 static int memcg_propagate_kmem(struct mem_cgroup *memcg)
 {
 	int ret = 0;
@@ -2979,16 +2961,45 @@ static void memcg_free_kmem(struct mem_cgroup *memcg)
 	}
 }
 #else
+static int memcg_propagate_kmem(struct mem_cgroup *memcg)
+{
+	return 0;
+}
+static void memcg_offline_kmem(struct mem_cgroup *memcg)
+{
+}
+static void memcg_free_kmem(struct mem_cgroup *memcg)
+{
+}
+#endif /* !CONFIG_SLOB */
+
+#ifdef CONFIG_MEMCG_KMEM
 static int memcg_update_kmem_limit(struct mem_cgroup *memcg,
 				   unsigned long limit)
 {
-	return -EINVAL;
+	int ret;
+
+	mutex_lock(&memcg_limit_mutex);
+	/* Top-level cgroup doesn't propagate from root */
+	if (!memcg_kmem_online(memcg)) {
+		ret = memcg_online_kmem(memcg);
+		if (ret)
+			goto out;
+	}
+	ret = page_counter_limit(&memcg->kmem, limit);
+out:
+	mutex_unlock(&memcg_limit_mutex);
+	return ret;
 }
-static void memcg_offline_kmem(struct mem_cgroup *memcg)
+#else
+static int memcg_update_kmem_limit(struct mem_cgroup *memcg,
+				   unsigned long limit)
 {
+	return -EINVAL;
 }
 #endif /* CONFIG_MEMCG_KMEM */
 
+
 /*
  * The user of this function is...
  * RES_LIMIT.
@@ -4160,7 +4171,7 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 	vmpressure_init(&memcg->vmpressure);
 	INIT_LIST_HEAD(&memcg->event_list);
 	spin_lock_init(&memcg->event_list_lock);
-#ifdef CONFIG_MEMCG_KMEM
+#ifndef CONFIG_SLOB
 	memcg->kmemcg_id = -1;
 #endif
 #ifdef CONFIG_CGROUP_WRITEBACK
@@ -4222,10 +4233,11 @@ mem_cgroup_css_online(struct cgroup_subsys_state *css)
 	}
 	mutex_unlock(&memcg_create_mutex);
 
-#ifdef CONFIG_MEMCG_KMEM
 	ret = memcg_propagate_kmem(memcg);
 	if (ret)
 		return ret;
+
+#ifdef CONFIG_MEMCG_KMEM
 	ret = tcp_init_cgroup(memcg);
 	if (ret)
 		return ret;
@@ -4279,8 +4291,9 @@ static void mem_cgroup_css_free(struct cgroup_subsys_state *css)
 		static_branch_dec(&memcg_sockets_enabled_key);
 #endif
 
-#ifdef CONFIG_MEMCG_KMEM
 	memcg_free_kmem(memcg);
+
+#ifdef CONFIG_MEMCG_KMEM
 	tcp_destroy_cgroup(memcg);
 #endif
 
diff --git a/mm/slab.h b/mm/slab.h
index c63b869..834ad24 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -173,7 +173,7 @@ ssize_t slabinfo_write(struct file *file, const char __user *buffer,
 void __kmem_cache_free_bulk(struct kmem_cache *, size_t, void **);
 int __kmem_cache_alloc_bulk(struct kmem_cache *, gfp_t, size_t, void **);
 
-#ifdef CONFIG_MEMCG_KMEM
+#if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
 /*
  * Iterate over all memcg caches of the given root cache. The caller must hold
  * slab_mutex.
@@ -251,7 +251,7 @@ static __always_inline int memcg_charge_slab(struct page *page,
 
 extern void slab_init_memcg_params(struct kmem_cache *);
 
-#else /* !CONFIG_MEMCG_KMEM */
+#else /* CONFIG_MEMCG && !CONFIG_SLOB */
 
 #define for_each_memcg_cache(iter, root) \
 	for ((void)(iter), (void)(root); 0; )
@@ -292,7 +292,7 @@ static inline int memcg_charge_slab(struct page *page, gfp_t gfp, int order,
 static inline void slab_init_memcg_params(struct kmem_cache *s)
 {
 }
-#endif /* CONFIG_MEMCG_KMEM */
+#endif /* CONFIG_MEMCG && !CONFIG_SLOB */
 
 static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x)
 {
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 8c262e6..b50aef0 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -128,7 +128,7 @@ int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t nr,
 	return i;
 }
 
-#ifdef CONFIG_MEMCG_KMEM
+#if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
 void slab_init_memcg_params(struct kmem_cache *s)
 {
 	s->memcg_params.is_root_cache = true;
@@ -221,7 +221,7 @@ static inline int init_memcg_params(struct kmem_cache *s,
 static inline void destroy_memcg_params(struct kmem_cache *s)
 {
 }
-#endif /* CONFIG_MEMCG_KMEM */
+#endif /* CONFIG_MEMCG && !CONFIG_SLOB */
 
 /*
  * Find a mergeable slab cache
@@ -477,7 +477,7 @@ static void release_caches(struct list_head *release, bool need_rcu_barrier)
 	}
 }
 
-#ifdef CONFIG_MEMCG_KMEM
+#if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
 /*
  * memcg_create_kmem_cache - Create a cache for a memory cgroup.
  * @memcg: The memory cgroup the new cache is for.
@@ -689,7 +689,7 @@ static inline int shutdown_memcg_caches(struct kmem_cache *s,
 {
 	return 0;
 }
-#endif /* CONFIG_MEMCG_KMEM */
+#endif /* CONFIG_MEMCG && !CONFIG_SLOB */
 
 void slab_kmem_cache_release(struct kmem_cache *s)
 {
@@ -1123,7 +1123,7 @@ static int slab_show(struct seq_file *m, void *p)
 	return 0;
 }
 
-#ifdef CONFIG_MEMCG_KMEM
+#if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
 int memcg_slab_show(struct seq_file *m, void *p)
 {
 	struct kmem_cache *s = list_entry(p, struct kmem_cache, list);
diff --git a/mm/slub.c b/mm/slub.c
index b21fd24..2e1355a 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -5207,7 +5207,7 @@ static ssize_t slab_attr_store(struct kobject *kobj,
 		return -EIO;
 
 	err = attribute->store(s, buf, len);
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG
 	if (slab_state >= FULL && err >= 0 && is_root_cache(s)) {
 		struct kmem_cache *c;
 
@@ -5242,7 +5242,7 @@ static ssize_t slab_attr_store(struct kobject *kobj,
 
 static void memcg_propagate_slab_attrs(struct kmem_cache *s)
 {
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG
 	int i;
 	char *buffer = NULL;
 	struct kmem_cache *root_cache;
@@ -5328,7 +5328,7 @@ static struct kset *slab_kset;
 
 static inline struct kset *cache_kset(struct kmem_cache *s)
 {
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG
 	if (!is_root_cache(s))
 		return s->memcg_params.root_cache->memcg_kset;
 #endif
@@ -5405,7 +5405,7 @@ static int sysfs_slab_add(struct kmem_cache *s)
 	if (err)
 		goto out_del_kobj;
 
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG
 	if (is_root_cache(s)) {
 		s->memcg_kset = kset_create_and_add("cgroup", NULL, &s->kobj);
 		if (!s->memcg_kset) {
@@ -5438,7 +5438,7 @@ void sysfs_slab_remove(struct kmem_cache *s)
 		 */
 		return;
 
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG
 	kset_unregister(s->memcg_kset);
 #endif
 	kobject_uevent(&s->kobj, KOBJ_REMOVE);
-- 
2.6.3

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* Re: [PATCH 6/8 v2] mm: memcontrol: move kmem accounting code to CONFIG_MEMCG
  2015-12-10 20:22     ` Johannes Weiner
  (?)
@ 2015-12-10 20:50       ` Johannes Weiner
  -1 siblings, 0 replies; 79+ messages in thread
From: Johannes Weiner @ 2015-12-10 20:50 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Michal Hocko, Vladimir Davydov, linux-mm, cgroups, linux-kernel,
	kernel-team, Arnd Bergmann

Narf. Almost there...

From db4522b2b3e6ca8ce5f6e673948772bcd8fdd298 Mon Sep 17 00:00:00 2001
From: Johannes Weiner <hannes@cmpxchg.org>
Date: Thu, 10 Dec 2015 15:42:54 -0500
Subject: [PATCH] mm: memcontrol: move kmem accounting code to CONFIG_MEMCG fix

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 include/linux/slab.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 3ffee74..3627d5c 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -86,7 +86,7 @@
 #else
 # define SLAB_FAILSLAB		0x00000000UL
 #endif
-#ifdef CONFIG_MEMCG_KMEM
+#if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
 # define SLAB_ACCOUNT		0x04000000UL	/* Account to memcg */
 #else
 # define SLAB_ACCOUNT		0x00000000UL
-- 
2.6.3
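
For readers following the one-line fix above: SLAB_ACCOUNT has to be
guarded by the same condition as the accounting code it controls,
otherwise the flag is non-zero in builds where nothing acts on it (or
zero where something should). A minimal sketch of the pattern, assuming
hand-written stand-ins for the Kconfig symbols; DEMO_OTHER_FLAG and
cache_is_accounted() are hypothetical and not part of the kernel:

```c
/*
 * Stand-ins for the Kconfig symbols (assumption: a real kernel build
 * gets these from its configuration, never from the source file).
 * Define CONFIG_SLOB here to see the flag collapse to zero.
 */
#define CONFIG_MEMCG 1
/* #define CONFIG_SLOB 1 */

/* Same guard and values as the hunk in include/linux/slab.h. */
#if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
# define SLAB_ACCOUNT		0x04000000UL	/* Account to memcg */
#else
# define SLAB_ACCOUNT		0x00000000UL
#endif

/* Hypothetical unrelated flag bit, only here to make the mask non-trivial. */
#define DEMO_OTHER_FLAG		0x00000001UL

/*
 * Hypothetical helper: returns 1 if the cache flags request memcg
 * accounting. When SLAB_ACCOUNT is defined as zero the mask can never
 * match, so callers may OR the flag in unconditionally without #ifdefs.
 */
static int cache_is_accounted(unsigned long flags)
{
	return (flags & SLAB_ACCOUNT) != 0;
}
```

With the stand-in configuration above, a caller passing
DEMO_OTHER_FLAG | SLAB_ACCOUNT is reported as accounted; with CONFIG_SLOB
defined the same call compiles unchanged and is reported as not accounted.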

^ permalink raw reply related	[flat|nested] 79+ messages in thread

end of thread, other threads:[~2015-12-10 20:50 UTC | newest]

Thread overview: 79+ messages
2015-12-08 18:34 [PATCH 0/8] mm: memcontrol: account "kmem" in cgroup2 Johannes Weiner
2015-12-08 18:34 ` [PATCH 1/8] mm: memcontrol: drop unused @css argument in memcg_init_kmem Johannes Weiner
2015-12-09  9:01   ` Vladimir Davydov
2015-12-10 12:37   ` Michal Hocko
2015-12-08 18:34 ` [PATCH 2/8] mm: memcontrol: remove double kmem page_counter init Johannes Weiner
2015-12-09  9:05   ` Vladimir Davydov
2015-12-10 12:40   ` Michal Hocko
2015-12-08 18:34 ` [PATCH 3/8] mm: memcontrol: give the kmem states more descriptive names Johannes Weiner
2015-12-09  9:10   ` Vladimir Davydov
2015-12-10 12:47   ` Michal Hocko
2015-12-08 18:34 ` [PATCH 4/8] mm: memcontrol: group kmem init and exit functions together Johannes Weiner
2015-12-09  9:14   ` Vladimir Davydov
2015-12-10 12:56   ` Michal Hocko
2015-12-08 18:34 ` [PATCH 5/8] mm: memcontrol: separate kmem code from legacy tcp accounting code Johannes Weiner
2015-12-09  9:23   ` Vladimir Davydov
2015-12-10 12:59   ` Michal Hocko
2015-12-08 18:34 ` [PATCH 6/8] mm: memcontrol: move kmem accounting code to CONFIG_MEMCG Johannes Weiner
2015-12-09  9:32   ` Vladimir Davydov
2015-12-10 13:17   ` Michal Hocko
2015-12-10 14:00     ` Johannes Weiner
2015-12-10 20:22   ` [PATCH 6/8 v2] " Johannes Weiner
2015-12-10 20:50     ` Johannes Weiner
2015-12-08 18:34 ` [PATCH 7/8] mm: memcontrol: account "kmem" consumers in cgroup2 memory controller Johannes Weiner
2015-12-09 11:30   ` Vladimir Davydov
2015-12-09 14:32     ` Johannes Weiner
2015-12-10 13:28     ` Michal Hocko
2015-12-10 15:16       ` Johannes Weiner
2015-12-10 16:25         ` Michal Hocko
2015-12-10 14:21   ` Michal Hocko
2015-12-08 18:34 ` [PATCH 8/8] mm: memcontrol: introduce CONFIG_MEMCG_LEGACY_KMEM Johannes Weiner
2015-12-09 11:31   ` Vladimir Davydov
