All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
@ 2013-01-24  6:46 ` Glauber Costa
  0 siblings, 0 replies; 20+ messages in thread
From: Glauber Costa @ 2013-01-24  6:46 UTC (permalink / raw)
  To: linux-mm
  Cc: cgroups, Glauber Costa, Michal Hocko, Kamezawa Hiroyuki,
	Johannes Weiner, Greg Thelen, Hugh Dickins, Ying Han, Mel Gorman,
	Rik van Riel

In order to maintain all the memcg bookkeeping, we need per-node
descriptors, which will in turn contain a per-zone descriptor.

Because we want to statically allocate those, this array ends up being
very big. Part of the reason is that we allocate something large enough
to hold MAX_NUMNODES, the compile time constant that holds the maximum
number of nodes we would ever consider.

However, we can do better in some cases if the firmware helps us. This
is true for modern x86 machines; coincidentally, x86 is also one of the
architectures on which MAX_NUMNODES tends to be very big.

By using the firmware-provided maximum number of nodes instead of
MAX_NUMNODES, we can reduce the memory footprint of struct memcg
considerably. In the extreme case, in which we have only one node, this
reduces the size of the structure from ~64k to ~2k. This is
particularly important because it means that we will no longer resort to
the vmalloc area for the struct memcg on defconfigs. We even have enough
room for an extra node while still staying outside the vmalloc area.

One also has to keep in mind that, with the industry's ability to fit
more processors on a die as fast as the Fed prints money, a nodes = 2
configuration is already respectably big.

[ v2: use size_t for size calculations ]
Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Greg Thelen <gthelen@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ying Han <yinghan@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
---
 mm/memcontrol.c | 40 +++++++++++++++++++++++++---------------
 1 file changed, 25 insertions(+), 15 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 09255ec..09d8b02 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -172,7 +172,7 @@ struct mem_cgroup_per_node {
 };
 
 struct mem_cgroup_lru_info {
-	struct mem_cgroup_per_node *nodeinfo[MAX_NUMNODES];
+	struct mem_cgroup_per_node *nodeinfo[0];
 };
 
 /*
@@ -276,17 +276,6 @@ struct mem_cgroup {
 	 */
 	struct res_counter kmem;
 	/*
-	 * Per cgroup active and inactive list, similar to the
-	 * per zone LRU lists.
-	 */
-	struct mem_cgroup_lru_info info;
-	int last_scanned_node;
-#if MAX_NUMNODES > 1
-	nodemask_t	scan_nodes;
-	atomic_t	numainfo_events;
-	atomic_t	numainfo_updating;
-#endif
-	/*
 	 * Should the accounting and control be hierarchical, per subtree?
 	 */
 	bool use_hierarchy;
@@ -349,8 +338,29 @@ struct mem_cgroup {
         /* Index in the kmem_cache->memcg_params->memcg_caches array */
 	int kmemcg_id;
 #endif
+
+	int last_scanned_node;
+#if MAX_NUMNODES > 1
+	nodemask_t	scan_nodes;
+	atomic_t	numainfo_events;
+	atomic_t	numainfo_updating;
+#endif
+	/*
+	 * Per cgroup active and inactive list, similar to the
+	 * per zone LRU lists.
+	 *
+	 * WARNING: This has to be the last element of the struct. Don't
+	 * add new fields after this point.
+	 */
+	struct mem_cgroup_lru_info info;
 };
 
+static inline size_t memcg_size(void)
+{
+	return sizeof(struct mem_cgroup) +
+		nr_node_ids * sizeof(struct mem_cgroup_per_node);
+}
+
 /* internal only representation about the status of kmem accounting. */
 enum {
 	KMEM_ACCOUNTED_ACTIVE = 0, /* accounted by this cgroup itself */
@@ -5894,9 +5904,9 @@ static void free_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node)
 static struct mem_cgroup *mem_cgroup_alloc(void)
 {
 	struct mem_cgroup *memcg;
-	int size = sizeof(struct mem_cgroup);
+	size_t size = memcg_size();
 
-	/* Can be very big if MAX_NUMNODES is very big */
+	/* Can be very big if nr_node_ids is very big */
 	if (size < PAGE_SIZE)
 		memcg = kzalloc(size, GFP_KERNEL);
 	else
@@ -5933,7 +5943,7 @@ out_free:
 static void __mem_cgroup_free(struct mem_cgroup *memcg)
 {
 	int node;
-	int size = sizeof(struct mem_cgroup);
+	size_t size = memcg_size();
 
 	mem_cgroup_remove_from_trees(memcg);
 	free_css_id(&mem_cgroup_subsys, &memcg->css);
-- 
1.8.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: dont@kvack.org

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
  2013-01-24  6:46 ` Glauber Costa
  (?)
@ 2013-01-24  7:50 ` Greg Thelen
  2013-01-24  7:52     ` Glauber Costa
  2013-01-24 23:51     ` Andrew Morton
  -1 siblings, 2 replies; 20+ messages in thread
From: Greg Thelen @ 2013-01-24  7:50 UTC (permalink / raw)
  To: Glauber Costa
  Cc: linux-mm, cgroups, Michal Hocko, Kamezawa Hiroyuki,
	Johannes Weiner, Hugh Dickins, Ying Han, Mel Gorman,
	Rik van Riel

On Wed, Jan 23 2013, Glauber Costa wrote:

> In order to maintain all the memcg bookkeeping, we need per-node
> descriptors, which will in turn contain a per-zone descriptor.
>
> Because we want to statically allocate those, this array ends up being
> very big. Part of the reason is that we allocate something large enough
> to hold MAX_NUMNODES, the compile time constant that holds the maximum
> number of nodes we would ever consider.
>
> However, we can do better in some cases if the firmware helps us. This
> is true for modern x86 machines; coincidentally, x86 is also one of the
> architectures on which MAX_NUMNODES tends to be very big.
>
> By using the firmware-provided maximum number of nodes instead of
> MAX_NUMNODES, we can reduce the memory footprint of struct memcg
> considerably. In the extreme case, in which we have only one node, this
> reduces the size of the structure from ~64k to ~2k. This is
> particularly important because it means that we will no longer resort to
> the vmalloc area for the struct memcg on defconfigs. We even have enough
> room for an extra node while still staying outside the vmalloc area.
>
> One also has to keep in mind that, with the industry's ability to fit
> more processors on a die as fast as the Fed prints money, a nodes = 2
> configuration is already respectably big.
>
> [ v2: use size_t for size calculations ]
> Signed-off-by: Glauber Costa <glommer@parallels.com>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Greg Thelen <gthelen@google.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Ying Han <yinghan@google.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Rik van Riel <riel@redhat.com>

Reviewed-by: Greg Thelen <gthelen@google.com>

> ---
>  mm/memcontrol.c | 40 +++++++++++++++++++++++++---------------
>  1 file changed, 25 insertions(+), 15 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 09255ec..09d8b02 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -172,7 +172,7 @@ struct mem_cgroup_per_node {
>  };
>  
>  struct mem_cgroup_lru_info {
> -	struct mem_cgroup_per_node *nodeinfo[MAX_NUMNODES];
> +	struct mem_cgroup_per_node *nodeinfo[0];

It seems like a VM_BUG_ON() in mem_cgroup_zoneinfo() asserting that the
nid index is less than nr_node_ids would be good at catching illegal
indexes.  I don't see any illegal indexes in your patch, but I fear that
someday a MAX_NUMNODES based for() loop might sneak in.

>  };
>  
>  /*
> @@ -276,17 +276,6 @@ struct mem_cgroup {
>  	 */
>  	struct res_counter kmem;
>  	/*
> -	 * Per cgroup active and inactive list, similar to the
> -	 * per zone LRU lists.
> -	 */
> -	struct mem_cgroup_lru_info info;
> -	int last_scanned_node;
> -#if MAX_NUMNODES > 1
> -	nodemask_t	scan_nodes;
> -	atomic_t	numainfo_events;
> -	atomic_t	numainfo_updating;
> -#endif
> -	/*
>  	 * Should the accounting and control be hierarchical, per subtree?
>  	 */
>  	bool use_hierarchy;
> @@ -349,8 +338,29 @@ struct mem_cgroup {
>          /* Index in the kmem_cache->memcg_params->memcg_caches array */
>  	int kmemcg_id;
>  #endif
> +
> +	int last_scanned_node;
> +#if MAX_NUMNODES > 1
> +	nodemask_t	scan_nodes;
> +	atomic_t	numainfo_events;
> +	atomic_t	numainfo_updating;
> +#endif
> +	/*
> +	 * Per cgroup active and inactive list, similar to the
> +	 * per zone LRU lists.
> +	 *
> +	 * WARNING: This has to be the last element of the struct. Don't
> +	 * add new fields after this point.
> +	 */
> +	struct mem_cgroup_lru_info info;
>  };
>  
> +static inline size_t memcg_size(void)
> +{
> +	return sizeof(struct mem_cgroup) +
> +		nr_node_ids * sizeof(struct mem_cgroup_per_node);
> +}
> +

Tangential question: why use inline here?  I figure that modern
compilers are good at making inlining decisions.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
@ 2013-01-24  7:52     ` Glauber Costa
  0 siblings, 0 replies; 20+ messages in thread
From: Glauber Costa @ 2013-01-24  7:52 UTC (permalink / raw)
  To: Greg Thelen
  Cc: linux-mm, cgroups, Michal Hocko, Kamezawa Hiroyuki,
	Johannes Weiner, Hugh Dickins, Ying Han, Mel Gorman,
	Rik van Riel


>> +static inline size_t memcg_size(void)
>> +{
>> +	return sizeof(struct mem_cgroup) +
>> +		nr_node_ids * sizeof(struct mem_cgroup_per_node);
>> +}
>> +
> 
> Tangential question: why use inline here?  I figure that modern
> compilers are good at making inlining decisions.
I was born last century.

No reason, really. Just habit.



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
@ 2013-01-24 10:14   ` Michal Hocko
  0 siblings, 0 replies; 20+ messages in thread
From: Michal Hocko @ 2013-01-24 10:14 UTC (permalink / raw)
  To: Glauber Costa
  Cc: linux-mm, cgroups, Kamezawa Hiroyuki, Johannes Weiner,
	Greg Thelen, Hugh Dickins, Ying Han, Mel Gorman, Rik van Riel

On Thu 24-01-13 10:46:35, Glauber Costa wrote:
> In order to maintain all the memcg bookkeeping, we need per-node
> descriptors, which will in turn contain a per-zone descriptor.
> 
> Because we want to statically allocate those, this array ends up being
> very big. Part of the reason is that we allocate something large enough
> to hold MAX_NUMNODES, the compile time constant that holds the maximum
> number of nodes we would ever consider.
> 
> However, we can do better in some cases if the firmware helps us. This
> is true for modern x86 machines; coincidentally, x86 is also one of the
> architectures on which MAX_NUMNODES tends to be very big.
> 
> By using the firmware-provided maximum number of nodes instead of
> MAX_NUMNODES, we can reduce the memory footprint of struct memcg
> considerably. In the extreme case, in which we have only one node, this
> reduces the size of the structure from ~64k to ~2k. This is
> particularly important because it means that we will no longer resort to
> the vmalloc area for the struct memcg on defconfigs. We even have enough
> room for an extra node while still staying outside the vmalloc area.
> 
> One also has to keep in mind that, with the industry's ability to fit
> more processors on a die as fast as the Fed prints money, a nodes = 2
> configuration is already respectably big.
> 
> [ v2: use size_t for size calculations ]
> Signed-off-by: Glauber Costa <glommer@parallels.com>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Greg Thelen <gthelen@google.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Ying Han <yinghan@google.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Rik van Riel <riel@redhat.com>

The patch seems fine, and I guess other things would need to be revisited
as well if nr_node_ids could change after the NUMA initialization code runs.

Acked-by: Michal Hocko <mhocko@suse.cz>

> ---
>  mm/memcontrol.c | 40 +++++++++++++++++++++++++---------------
>  1 file changed, 25 insertions(+), 15 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 09255ec..09d8b02 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -172,7 +172,7 @@ struct mem_cgroup_per_node {
>  };
>  
>  struct mem_cgroup_lru_info {
> -	struct mem_cgroup_per_node *nodeinfo[MAX_NUMNODES];
> +	struct mem_cgroup_per_node *nodeinfo[0];
>  };
>  
>  /*
> @@ -276,17 +276,6 @@ struct mem_cgroup {
>  	 */
>  	struct res_counter kmem;
>  	/*
> -	 * Per cgroup active and inactive list, similar to the
> -	 * per zone LRU lists.
> -	 */
> -	struct mem_cgroup_lru_info info;
> -	int last_scanned_node;
> -#if MAX_NUMNODES > 1
> -	nodemask_t	scan_nodes;
> -	atomic_t	numainfo_events;
> -	atomic_t	numainfo_updating;
> -#endif
> -	/*
>  	 * Should the accounting and control be hierarchical, per subtree?
>  	 */
>  	bool use_hierarchy;
> @@ -349,8 +338,29 @@ struct mem_cgroup {
>          /* Index in the kmem_cache->memcg_params->memcg_caches array */
>  	int kmemcg_id;
>  #endif
> +
> +	int last_scanned_node;
> +#if MAX_NUMNODES > 1
> +	nodemask_t	scan_nodes;
> +	atomic_t	numainfo_events;
> +	atomic_t	numainfo_updating;
> +#endif
> +	/*
> +	 * Per cgroup active and inactive list, similar to the
> +	 * per zone LRU lists.
> +	 *
> +	 * WARNING: This has to be the last element of the struct. Don't
> +	 * add new fields after this point.
> +	 */
> +	struct mem_cgroup_lru_info info;
>  };
>  
> +static inline size_t memcg_size(void)
> +{
> +	return sizeof(struct mem_cgroup) +
> +		nr_node_ids * sizeof(struct mem_cgroup_per_node);
> +}
> +
>  /* internal only representation about the status of kmem accounting. */
>  enum {
>  	KMEM_ACCOUNTED_ACTIVE = 0, /* accounted by this cgroup itself */
> @@ -5894,9 +5904,9 @@ static void free_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node)
>  static struct mem_cgroup *mem_cgroup_alloc(void)
>  {
>  	struct mem_cgroup *memcg;
> -	int size = sizeof(struct mem_cgroup);
> +	size_t size = memcg_size();
>  
> -	/* Can be very big if MAX_NUMNODES is very big */
> +	/* Can be very big if nr_node_ids is very big */
>  	if (size < PAGE_SIZE)
>  		memcg = kzalloc(size, GFP_KERNEL);
>  	else
> @@ -5933,7 +5943,7 @@ out_free:
>  static void __mem_cgroup_free(struct mem_cgroup *memcg)
>  {
>  	int node;
> -	int size = sizeof(struct mem_cgroup);
> +	size_t size = memcg_size();
>  
>  	mem_cgroup_remove_from_trees(memcg);
>  	free_css_id(&mem_cgroup_subsys, &memcg->css);
> -- 
> 1.8.1
> 

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
@ 2013-01-24 23:51     ` Andrew Morton
  0 siblings, 0 replies; 20+ messages in thread
From: Andrew Morton @ 2013-01-24 23:51 UTC (permalink / raw)
  To: Greg Thelen
  Cc: Glauber Costa, linux-mm, cgroups, Michal Hocko,
	Kamezawa Hiroyuki, Johannes Weiner, Hugh Dickins, Ying Han,
	Mel Gorman, Rik van Riel

On Wed, 23 Jan 2013 23:50:31 -0800
Greg Thelen <gthelen@google.com> wrote:

> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -172,7 +172,7 @@ struct mem_cgroup_per_node {
> >  };
> >  
> >  struct mem_cgroup_lru_info {
> > -	struct mem_cgroup_per_node *nodeinfo[MAX_NUMNODES];
> > +	struct mem_cgroup_per_node *nodeinfo[0];
> 
> It seems like a VM_BUG_ON() in mem_cgroup_zoneinfo() asserting that the
> nid index is less than nr_node_ids would be good at catching illegal
> indexes.  I don't see any illegal indexes in your patch, but I fear that
> someday a MAX_NUMNODES based for() loop might sneak in.

Can't hurt ;)

> Tangential question: why use inline here?  I figure that modern
> compilers are good at making inlining decisions.

And that'll save some disk space.

This?

--- a/mm/memcontrol.c~memcg-reduce-the-size-of-struct-memcg-244-fold-fix
+++ a/mm/memcontrol.c
@@ -381,7 +381,7 @@ enum {
 		((1 << KMEM_ACCOUNTED_ACTIVE) | (1 << KMEM_ACCOUNTED_ACTIVATED))
 
 #ifdef CONFIG_MEMCG_KMEM
-static inline void memcg_kmem_set_active(struct mem_cgroup *memcg)
+static void memcg_kmem_set_active(struct mem_cgroup *memcg)
 {
 	set_bit(KMEM_ACCOUNTED_ACTIVE, &memcg->kmem_account_flags);
 }
@@ -645,6 +645,7 @@ static void drain_all_stock_async(struct
 static struct mem_cgroup_per_zone *
 mem_cgroup_zoneinfo(struct mem_cgroup *memcg, int nid, int zid)
 {
+	VM_BUG_ON((unsigned)nid >= nr_node_ids);
 	return &memcg->info.nodeinfo[nid]->zoneinfo[zid];
 }
 
_

Glauber, could you please cc me on patches more often?  It's a bit of a
pita having to go back to the mailing list to see if there has been
more discussion, and I may end up missing late review comments and acks.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
  2013-01-24 23:51     ` Andrew Morton
  (?)
@ 2013-01-25  7:37     ` Lord Glauber Costa of Sealand
  -1 siblings, 0 replies; 20+ messages in thread
From: Lord Glauber Costa of Sealand @ 2013-01-25  7:37 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Greg Thelen, linux-mm, cgroups, Michal Hocko, Kamezawa Hiroyuki,
	Johannes Weiner, Hugh Dickins, Ying Han, Mel Gorman,
	Rik van Riel


>  #ifdef CONFIG_MEMCG_KMEM
> -static inline void memcg_kmem_set_active(struct mem_cgroup *memcg)
> +static void memcg_kmem_set_active(struct mem_cgroup *memcg)
>  {
>  	set_bit(KMEM_ACCOUNTED_ACTIVE, &memcg->kmem_account_flags);
>  }
> @@ -645,6 +645,7 @@ static void drain_all_stock_async(struct
>  static struct mem_cgroup_per_zone *
>  mem_cgroup_zoneinfo(struct mem_cgroup *memcg, int nid, int zid)
>  {
> +	VM_BUG_ON((unsigned)nid >= nr_node_ids);
>  	return &memcg->info.nodeinfo[nid]->zoneinfo[zid];
>  }
>  
> _
> 
> Glauber, could you please cc me on patches more often?  It's a bit of a
> pita having to go back to the mailing list to see if there has been
> more discussion and I may end up missing late review comments and acks.
> 
Sure, absolutely.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
  2013-01-24 23:51     ` Andrew Morton
  (?)
  (?)
@ 2013-01-25 17:14     ` Greg Thelen
  -1 siblings, 0 replies; 20+ messages in thread
From: Greg Thelen @ 2013-01-25 17:14 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Glauber Costa, linux-mm, cgroups, Michal Hocko,
	Kamezawa Hiroyuki, Johannes Weiner, Hugh Dickins, Ying Han,
	Mel Gorman, Rik van Riel

On Thu, Jan 24 2013, Andrew Morton wrote:

> On Wed, 23 Jan 2013 23:50:31 -0800
> Greg Thelen <gthelen@google.com> wrote:
>
>> > --- a/mm/memcontrol.c
>> > +++ b/mm/memcontrol.c
>> > @@ -172,7 +172,7 @@ struct mem_cgroup_per_node {
>> >  };
>> >  
>> >  struct mem_cgroup_lru_info {
>> > -	struct mem_cgroup_per_node *nodeinfo[MAX_NUMNODES];
>> > +	struct mem_cgroup_per_node *nodeinfo[0];
>> 
>> It seems like a VM_BUG_ON() in mem_cgroup_zoneinfo() asserting that the
>> nid index is less than nr_node_ids would be good at catching illegal
>> indexes.  I don't see any illegal indexes in your patch, but I fear that
>> someday a MAX_NUMNODES based for() loop might sneak in.
>
> Can't hurt ;)
>
>> Tangential question: why use inline here?  I figure that modern
>> compilers are good at making inlining decisions.
>
> And that'll save some disk space.
>
> This?

Yup, that looks good to me.

Acked-by: Greg Thelen <gthelen@google.com>

> --- a/mm/memcontrol.c~memcg-reduce-the-size-of-struct-memcg-244-fold-fix
> +++ a/mm/memcontrol.c
> @@ -381,7 +381,7 @@ enum {
>  		((1 << KMEM_ACCOUNTED_ACTIVE) | (1 << KMEM_ACCOUNTED_ACTIVATED))
>  
>  #ifdef CONFIG_MEMCG_KMEM
> -static inline void memcg_kmem_set_active(struct mem_cgroup *memcg)
> +static void memcg_kmem_set_active(struct mem_cgroup *memcg)
>  {
>  	set_bit(KMEM_ACCOUNTED_ACTIVE, &memcg->kmem_account_flags);
>  }
> @@ -645,6 +645,7 @@ static void drain_all_stock_async(struct
>  static struct mem_cgroup_per_zone *
>  mem_cgroup_zoneinfo(struct mem_cgroup *memcg, int nid, int zid)
>  {
> +	VM_BUG_ON((unsigned)nid >= nr_node_ids);
>  	return &memcg->info.nodeinfo[nid]->zoneinfo[zid];
>  }
>  
> _
>
> Glauber, could you please cc me on patches more often?  It's a bit of a
> pita having to go back to the mailing list to see if there has been
> more discussion and I may end up missing late review comments and acks.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
@ 2013-01-29  0:08   ` Kamezawa Hiroyuki
  0 siblings, 0 replies; 20+ messages in thread
From: Kamezawa Hiroyuki @ 2013-01-29  0:08 UTC (permalink / raw)
  To: Glauber Costa
  Cc: linux-mm, cgroups, Michal Hocko, Johannes Weiner, Greg Thelen,
	Hugh Dickins, Ying Han, Mel Gorman, Rik van Riel

(2013/01/24 15:46), Glauber Costa wrote:
> In order to maintain all the memcg bookkeeping, we need per-node
> descriptors, which will in turn contain a per-zone descriptor.
> 
> Because we want to statically allocate those, this array ends up being
> very big. Part of the reason is that we allocate something large enough
> to hold MAX_NUMNODES, the compile time constant that holds the maximum
> number of nodes we would ever consider.
> 
> However, we can do better in some cases if the firmware help us. This is
> true for modern x86 machines; coincidentally one of the architectures in
> which MAX_NUMNODES tends to be very big.
> 
> By using the firmware-provided maximum number of nodes instead of
> MAX_NUMNODES, we can reduce the memory footprint of struct memcg
> considerably. In the extreme case in which we have only one node, this
> reduces the size of the structure from ~ 64k to ~2k. This is
> particularly important because it means that we will no longer resort to
> the vmalloc area for the struct memcg on defconfigs. We also have enough
> room for an extra node and still be outside vmalloc.
> 
> One also has to keep in mind that with the industry's ability to fit
> more processors in a die as fast as the FED prints money, a nodes = 2
> configuration is already respectably big.
> 
> [ v2: use size_t for size calculations ]
> Signed-off-by: Glauber Costa <glommer@parallels.com>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Greg Thelen <gthelen@google.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Ying Han <yinghan@google.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Rik van Riel <riel@redhat.com>
> ---
>   mm/memcontrol.c | 40 +++++++++++++++++++++++++---------------
>   1 file changed, 25 insertions(+), 15 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 09255ec..09d8b02 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -172,7 +172,7 @@ struct mem_cgroup_per_node {
>   };
>   
>   struct mem_cgroup_lru_info {
> -	struct mem_cgroup_per_node *nodeinfo[MAX_NUMNODES];
> +	struct mem_cgroup_per_node *nodeinfo[0];
>   };
>   
>   /*
> @@ -276,17 +276,6 @@ struct mem_cgroup {
>   	 */
>   	struct res_counter kmem;
>   	/*
> -	 * Per cgroup active and inactive list, similar to the
> -	 * per zone LRU lists.
> -	 */
> -	struct mem_cgroup_lru_info info;
> -	int last_scanned_node;
> -#if MAX_NUMNODES > 1
> -	nodemask_t	scan_nodes;
> -	atomic_t	numainfo_events;
> -	atomic_t	numainfo_updating;
> -#endif
> -	/*
>   	 * Should the accounting and control be hierarchical, per subtree?
>   	 */
>   	bool use_hierarchy;
> @@ -349,8 +338,29 @@ struct mem_cgroup {
>           /* Index in the kmem_cache->memcg_params->memcg_caches array */
>   	int kmemcg_id;
>   #endif
> +
> +	int last_scanned_node;
> +#if MAX_NUMNODES > 1
> +	nodemask_t	scan_nodes;
> +	atomic_t	numainfo_events;
> +	atomic_t	numainfo_updating;
> +#endif
> +	/*
> +	 * Per cgroup active and inactive list, similar to the
> +	 * per zone LRU lists.
> +	 *
> +	 * WARNING: This has to be the last element of the struct. Don't
> +	 * add new fields after this point.
> +	 */
> +	struct mem_cgroup_lru_info info;
>   };
>   
> +static inline size_t memcg_size(void)
> +{
> +	return sizeof(struct mem_cgroup) +
> +		nr_node_ids * sizeof(struct mem_cgroup_per_node);
> +}
> 
ok, nr_node_ids is made from possible_node_map.

Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>





^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
  2013-01-24 23:51     ` Andrew Morton
                       ` (2 preceding siblings ...)
  (?)
@ 2013-02-05 18:37     ` Johannes Weiner
  -1 siblings, 0 replies; 20+ messages in thread
From: Johannes Weiner @ 2013-02-05 18:37 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Greg Thelen, Glauber Costa, linux-mm, cgroups, Michal Hocko,
	Kamezawa Hiroyuki, Hugh Dickins, Ying Han, Mel Gorman,
	Rik van Riel

On Thu, Jan 24, 2013 at 03:51:05PM -0800, Andrew Morton wrote:
> On Wed, 23 Jan 2013 23:50:31 -0800
> Greg Thelen <gthelen@google.com> wrote:
> 
> > > --- a/mm/memcontrol.c
> > > +++ b/mm/memcontrol.c
> > > @@ -172,7 +172,7 @@ struct mem_cgroup_per_node {
> > >  };
> > >  
> > >  struct mem_cgroup_lru_info {
> > > -	struct mem_cgroup_per_node *nodeinfo[MAX_NUMNODES];
> > > +	struct mem_cgroup_per_node *nodeinfo[0];
> > 
> > It seems like a VM_BUG_ON() in mem_cgroup_zoneinfo() asserting that the
> > nid index is less than nr_node_ids would be good at catching illegal
> > indexes.  I don't see any illegal indexes in your patch, but I fear that
> > someday a MAX_NUMNODES based for() loop might sneak in.
> 
> Can't hurt ;)
> 
> > Tangential question: why use inline here?  I figure that modern
> > compilers are good at making inlining decisions.
> 
> And that'll save some disk space.
> 
> This?
> 
> --- a/mm/memcontrol.c~memcg-reduce-the-size-of-struct-memcg-244-fold-fix
> +++ a/mm/memcontrol.c
> @@ -381,7 +381,7 @@ enum {
>  		((1 << KMEM_ACCOUNTED_ACTIVE) | (1 << KMEM_ACCOUNTED_ACTIVATED))
>  
>  #ifdef CONFIG_MEMCG_KMEM
> -static inline void memcg_kmem_set_active(struct mem_cgroup *memcg)
> +static void memcg_kmem_set_active(struct mem_cgroup *memcg)
>  {
>  	set_bit(KMEM_ACCOUNTED_ACTIVE, &memcg->kmem_account_flags);
>  }

I don't disapprove, but it's the wrong function for this patch.
Should be memcg_size().


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
@ 2013-02-05 18:53   ` Johannes Weiner
  0 siblings, 0 replies; 20+ messages in thread
From: Johannes Weiner @ 2013-02-05 18:53 UTC (permalink / raw)
  To: Glauber Costa
  Cc: linux-mm, cgroups, Michal Hocko, Kamezawa Hiroyuki, Greg Thelen,
	Hugh Dickins, Ying Han, Mel Gorman, Rik van Riel

On Thu, Jan 24, 2013 at 10:46:35AM +0400, Glauber Costa wrote:
> In order to maintain all the memcg bookkeeping, we need per-node
> descriptors, which will in turn contain a per-zone descriptor.
> 
> Because we want to statically allocate those, this array ends up being
> very big. Part of the reason is that we allocate something large enough
> to hold MAX_NUMNODES, the compile time constant that holds the maximum
> number of nodes we would ever consider.
> 
> However, we can do better in some cases if the firmware help us. This is
> true for modern x86 machines; coincidentally one of the architectures in
> which MAX_NUMNODES tends to be very big.
> 
> By using the firmware-provided maximum number of nodes instead of
> MAX_NUMNODES, we can reduce the memory footprint of struct memcg
> considerably. In the extreme case in which we have only one node, this
> reduces the size of the structure from ~ 64k to ~2k. This is
> particularly important because it means that we will no longer resort to
> the vmalloc area for the struct memcg on defconfigs. We also have enough
> room for an extra node and still be outside vmalloc.
> 
> One also has to keep in mind that with the industry's ability to fit
> more processors in a die as fast as the FED prints money, a nodes = 2
> configuration is already respectably big.
> 
> [ v2: use size_t for size calculations ]
> Signed-off-by: Glauber Costa <glommer@parallels.com>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Greg Thelen <gthelen@google.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Ying Han <yinghan@google.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Rik van Riel <riel@redhat.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

Nitpick:

> @@ -349,8 +338,29 @@ struct mem_cgroup {
>          /* Index in the kmem_cache->memcg_params->memcg_caches array */
>  	int kmemcg_id;
>  #endif
> +
> +	int last_scanned_node;
> +#if MAX_NUMNODES > 1
> +	nodemask_t	scan_nodes;
> +	atomic_t	numainfo_events;
> +	atomic_t	numainfo_updating;
> +#endif
> +	/*
> +	 * Per cgroup active and inactive list, similar to the
> +	 * per zone LRU lists.
> +	 *
> +	 * WARNING: This has to be the last element of the struct. Don't
> +	 * add new fields after this point.
> +	 */
> +	struct mem_cgroup_lru_info info;

I can see myself ignoring comments pertaining to previous members when
adding to a struct.  The indirection through mem_cgroup_lru_info can
probably be dropped anyway, and it moves the [0] in a place where it
helps document the struct mem_cgroup layout.  What do you think about
the following:

---
Subject: [patch] memcg: reduce the size of struct memcg 244-fold morrr fix

Remove struct mem_cgroup_lru_info.  It only holds the nodeinfo array
and is actively misleading because there is all kinds of per-node
stuff in addition to the LRU info in there.  On that note, remove the
incorrect comment as well.

Move comment about the nodeinfo[0] array having to be the last field
in struct mem_cgroup after said array.  Should be more visible when
attempting to append new members to the struct.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2382fe9..29cb9e9 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -179,10 +179,6 @@ struct mem_cgroup_per_node {
 	struct mem_cgroup_per_zone zoneinfo[MAX_NR_ZONES];
 };
 
-struct mem_cgroup_lru_info {
-	struct mem_cgroup_per_node *nodeinfo[0];
-};
-
 /*
  * Cgroups above their limits are maintained in a RB-Tree, independent of
  * their hierarchy representation
@@ -370,14 +366,8 @@ struct mem_cgroup {
 	atomic_t	numainfo_events;
 	atomic_t	numainfo_updating;
 #endif
-	/*
-	 * Per cgroup active and inactive list, similar to the
-	 * per zone LRU lists.
-	 *
-	 * WARNING: This has to be the last element of the struct. Don't
-	 * add new fields after this point.
-	 */
-	struct mem_cgroup_lru_info info;
+	struct mem_cgroup_per_node *nodeinfo[0];
+	/* WARNING: nodeinfo has to be the last member in here */
 };
 
 static inline size_t memcg_size(void)
@@ -718,7 +708,7 @@ static struct mem_cgroup_per_zone *
 mem_cgroup_zoneinfo(struct mem_cgroup *memcg, int nid, int zid)
 {
 	VM_BUG_ON((unsigned)nid >= nr_node_ids);
-	return &memcg->info.nodeinfo[nid]->zoneinfo[zid];
+	return &memcg->nodeinfo[nid]->zoneinfo[zid];
 }
 
 struct cgroup_subsys_state *mem_cgroup_css(struct mem_cgroup *memcg)
@@ -6093,13 +6083,13 @@ static int alloc_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node)
 		mz->on_tree = false;
 		mz->memcg = memcg;
 	}
-	memcg->info.nodeinfo[node] = pn;
+	memcg->nodeinfo[node] = pn;
 	return 0;
 }
 
 static void free_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node)
 {
-	kfree(memcg->info.nodeinfo[node]);
+	kfree(memcg->nodeinfo[node]);
 }
 
 static struct mem_cgroup *mem_cgroup_alloc(void)


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
@ 2013-02-05 19:04     ` Michal Hocko
  0 siblings, 0 replies; 20+ messages in thread
From: Michal Hocko @ 2013-02-05 19:04 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Glauber Costa, linux-mm, cgroups, Kamezawa Hiroyuki, Greg Thelen,
	Hugh Dickins, Ying Han, Mel Gorman, Rik van Riel

On Tue 05-02-13 13:53:24, Johannes Weiner wrote:
[...]
> Subject: [patch] memcg: reduce the size of struct memcg 244-fold morrr fix
> 
> Remove struct mem_cgroup_lru_info.  It only holds the nodeinfo array
> and is actively misleading because there is all kinds of per-node
> stuff in addition to the LRU info in there.  On that note, remove the
> incorrect comment as well.
> 
> Move comment about the nodeinfo[0] array having to be the last field
> in struct mem_cgroup after said array.  Should be more visible when
> attempting to append new members to the struct.
> 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

Yes, I like this. The info level is just artificial and serves no
good purpose.

Acked-by: Michal Hocko <mhocko@suse.cz>

Thanks

> ---
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 2382fe9..29cb9e9 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -179,10 +179,6 @@ struct mem_cgroup_per_node {
>  	struct mem_cgroup_per_zone zoneinfo[MAX_NR_ZONES];
>  };
>  
> -struct mem_cgroup_lru_info {
> -	struct mem_cgroup_per_node *nodeinfo[0];
> -};
> -
>  /*
>   * Cgroups above their limits are maintained in a RB-Tree, independent of
>   * their hierarchy representation
> @@ -370,14 +366,8 @@ struct mem_cgroup {
>  	atomic_t	numainfo_events;
>  	atomic_t	numainfo_updating;
>  #endif
> -	/*
> -	 * Per cgroup active and inactive list, similar to the
> -	 * per zone LRU lists.
> -	 *
> -	 * WARNING: This has to be the last element of the struct. Don't
> -	 * add new fields after this point.
> -	 */
> -	struct mem_cgroup_lru_info info;
> +	struct mem_cgroup_per_node *nodeinfo[0];
> +	/* WARNING: nodeinfo has to be the last member in here */
>  };
>  
>  static inline size_t memcg_size(void)
> @@ -718,7 +708,7 @@ static struct mem_cgroup_per_zone *
>  mem_cgroup_zoneinfo(struct mem_cgroup *memcg, int nid, int zid)
>  {
>  	VM_BUG_ON((unsigned)nid >= nr_node_ids);
> -	return &memcg->info.nodeinfo[nid]->zoneinfo[zid];
> +	return &memcg->nodeinfo[nid]->zoneinfo[zid];
>  }
>  
>  struct cgroup_subsys_state *mem_cgroup_css(struct mem_cgroup *memcg)
> @@ -6093,13 +6083,13 @@ static int alloc_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node)
>  		mz->on_tree = false;
>  		mz->memcg = memcg;
>  	}
> -	memcg->info.nodeinfo[node] = pn;
> +	memcg->nodeinfo[node] = pn;
>  	return 0;
>  }
>  
>  static void free_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node)
>  {
> -	kfree(memcg->info.nodeinfo[node]);
> +	kfree(memcg->nodeinfo[node]);
>  }
>  
>  static struct mem_cgroup *mem_cgroup_alloc(void)
> 

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2] memcg: reduce the size of struct memcg 244-fold.
@ 2013-02-05 19:06       ` Glauber Costa
  0 siblings, 0 replies; 20+ messages in thread
From: Glauber Costa @ 2013-02-05 19:06 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Johannes Weiner, linux-mm, cgroups, Kamezawa Hiroyuki,
	Greg Thelen, Hugh Dickins, Ying Han, Mel Gorman, Rik van Riel

On 02/05/2013 11:04 PM, Michal Hocko wrote:
> On Tue 05-02-13 13:53:24, Johannes Weiner wrote:
> [...]
>> Subject: [patch] memcg: reduce the size of struct memcg 244-fold morrr fix
>>
>> Remove struct mem_cgroup_lru_info.  It only holds the nodeinfo array
>> and is actively misleading because there is all kinds of per-node
>> stuff in addition to the LRU info in there.  On that note, remove the
>> incorrect comment as well.
>>
>> Move comment about the nodeinfo[0] array having to be the last field
>> in struct mem_cgroup after said array.  Should be more visible when
>> attempting to append new members to the struct.
>>
>> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> 
> Yes, I like this. The info level is just artificial and without any
> good reason.
> 
> Acked-by: Michal Hocko <mhocko@suse.cz>
> 
Acked-by: Glauber Costa <glommer@parallels.com>



end of thread, other threads:[~2013-02-05 19:06 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-01-24  6:46 [PATCH v2] memcg: reduce the size of struct memcg 244-fold Glauber Costa
2013-01-24  6:46 ` Glauber Costa
2013-01-24  7:50 ` Greg Thelen
2013-01-24  7:52   ` Glauber Costa
2013-01-24  7:52     ` Glauber Costa
2013-01-24 23:51   ` Andrew Morton
2013-01-24 23:51     ` Andrew Morton
2013-01-25  7:37     ` Lord Glauber Costa of Sealand
2013-01-25 17:14     ` Greg Thelen
2013-02-05 18:37     ` Johannes Weiner
2013-01-24 10:14 ` Michal Hocko
2013-01-24 10:14   ` Michal Hocko
2013-01-29  0:08 ` Kamezawa Hiroyuki
2013-01-29  0:08   ` Kamezawa Hiroyuki
2013-02-05 18:53 ` Johannes Weiner
2013-02-05 18:53   ` Johannes Weiner
2013-02-05 19:04   ` Michal Hocko
2013-02-05 19:04     ` Michal Hocko
2013-02-05 19:06     ` Glauber Costa
2013-02-05 19:06       ` Glauber Costa
