* [patch] mm: memcontrol: do not iterate uninitialized memcgs
@ 2014-09-25  2:31 ` Johannes Weiner
  0 siblings, 0 replies; 23+ messages in thread
From: Johannes Weiner @ 2014-09-25  2:31 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Hugh Dickins, Tejun Heo, Michal Hocko, linux-mm, cgroups, linux-kernel

The cgroup iterators yield css objects that have not yet gone through
css_online(), but they are not complete memcgs at this point and so
the memcg iterators should not return them.  d8ad30559715 ("mm/memcg:
iteration skip memcgs not yet fully initialized") set out to implement
exactly this, but it uses CSS_ONLINE, a cgroup-internal flag that does
not meet the ordering requirements for memcg, and so we still may see
partially initialized memcgs from the iterators.

The cgroup core can not reasonably provide a clear answer on whether
the object around the css has been fully initialized, as that depends
on controller-specific locking and lifetime rules.  Thus, introduce a
memcg-specific flag that is set after the memcg has been initialized
in css_online(), and read before mem_cgroup_iter() callers access the
memcg members.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@vger.kernel.org>	[3.12+]
---
 mm/memcontrol.c | 35 ++++++++++++++++++++++++++++++-----
 1 file changed, 30 insertions(+), 5 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 306b6470784c..71ed15e3a148 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -292,6 +292,9 @@ struct mem_cgroup {
 	/* vmpressure notifications */
 	struct vmpressure vmpressure;
 
+	/* css_online() has been completed */
+	bool initialized;
+
 	/*
 	 * the counter to account for mem+swap usage.
 	 */
@@ -1090,10 +1093,22 @@ skip_node:
 	 * skipping css reference should be safe.
 	 */
 	if (next_css) {
-		if ((next_css == &root->css) ||
-		    ((next_css->flags & CSS_ONLINE) &&
-		     css_tryget_online(next_css)))
-			return mem_cgroup_from_css(next_css);
+		if (next_css == &root->css ||
+		    css_tryget_online(next_css)) {
+			struct mem_cgroup *memcg;
+
+			memcg = mem_cgroup_from_css(next_css);
+			if (memcg->initialized) {
+				/*
+				 * Make sure the caller's accesses to
+				 * the memcg members are issued after
+				 * we see this flag set.
+				 */
+				smp_rmb();
+				return memcg;
+			}
+			css_put(next_css);
+		}
 
 		prev_css = next_css;
 		goto skip_node;
@@ -5413,6 +5428,7 @@ mem_cgroup_css_online(struct cgroup_subsys_state *css)
 {
 	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 	struct mem_cgroup *parent = mem_cgroup_from_css(css->parent);
+	int ret;
 
 	if (css->id > MEM_CGROUP_ID_MAX)
 		return -ENOSPC;
@@ -5449,7 +5465,16 @@ mem_cgroup_css_online(struct cgroup_subsys_state *css)
 	}
 	mutex_unlock(&memcg_create_mutex);
 
-	return memcg_init_kmem(memcg, &memory_cgrp_subsys);
+	ret = memcg_init_kmem(memcg, &memory_cgrp_subsys);
+	if (ret)
+		return ret;
+
+	/* Make sure the initialization is visible before the flag */
+	smp_wmb();
+
+	memcg->initialized = true;
+
+	return 0;
 }
 
 /*
-- 
2.1.0
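
The pairing described in the changelog above (initialize, write barrier, set flag; check flag, read barrier, use members) can be illustrated outside the kernel. What follows is only a userspace sketch under stated assumptions, not the kernel code: C11 fences stand in for smp_wmb() and smp_rmb() (the C11 fences are somewhat stronger than the kernel primitives), and struct toy_memcg, toy_online() and toy_iter_visit() are hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mem_cgroup; not the kernel type. */
struct toy_memcg {
	int limit;			/* payload set up while going online */
	atomic_bool initialized;	/* plays the role of memcg->initialized */
};

/* Writer side, shaped like the tail of mem_cgroup_css_online(). */
static void toy_online(struct toy_memcg *m, int limit)
{
	assert(m != NULL);
	m->limit = limit;				/* initialize members first */
	atomic_thread_fence(memory_order_release);	/* stand-in for smp_wmb() */
	atomic_store_explicit(&m->initialized, true, memory_order_relaxed);
}

/*
 * Reader side, shaped like the check in the iterator.
 * Returns NULL when the group must be skipped.
 */
static struct toy_memcg *toy_iter_visit(struct toy_memcg *m)
{
	assert(m != NULL);
	if (!atomic_load_explicit(&m->initialized, memory_order_relaxed))
		return NULL;
	atomic_thread_fence(memory_order_acquire);	/* stand-in for smp_rmb() */
	return m;					/* members are now safe to read */
}
```

A caller would treat a NULL return as "skip this group and keep walking"; once toy_online() has run, the same call returns the group and its limit field can be read.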


* [patch v2] mm: memcontrol: do not iterate uninitialized memcgs
  2014-09-25  2:31 ` Johannes Weiner
  (?)
@ 2014-09-25  2:40   ` Johannes Weiner
  -1 siblings, 0 replies; 23+ messages in thread
From: Johannes Weiner @ 2014-09-25  2:40 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Hugh Dickins, Tejun Heo, Michal Hocko, linux-mm, cgroups, linux-kernel

Argh, buggy css_put() against the root.  Hand grenades, everywhere.
Update:

---
From 9b0b4d72d71cd8acd7aaa58d2006c751decc8739 Mon Sep 17 00:00:00 2001
From: Johannes Weiner <hannes@cmpxchg.org>
Date: Wed, 24 Sep 2014 22:00:20 -0400
Subject: [patch] mm: memcontrol: do not iterate uninitialized memcgs

The cgroup iterators yield css objects that have not yet gone through
css_online(), but they are not complete memcgs at this point and so
the memcg iterators should not return them.  d8ad30559715 ("mm/memcg:
iteration skip memcgs not yet fully initialized") set out to implement
exactly this, but it uses CSS_ONLINE, a cgroup-internal flag that does
not meet the ordering requirements for memcg, and so we still may see
partially initialized memcgs from the iterators.

The cgroup core can not reasonably provide a clear answer on whether
the object around the css has been fully initialized, as that depends
on controller-specific locking and lifetime rules.  Thus, introduce a
memcg-specific flag that is set after the memcg has been initialized
in css_online(), and read before mem_cgroup_iter() callers access the
memcg members.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@vger.kernel.org>	[3.12+]
---
 mm/memcontrol.c | 36 +++++++++++++++++++++++++++++++-----
 1 file changed, 31 insertions(+), 5 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 306b6470784c..bafdac0f724e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -292,6 +292,9 @@ struct mem_cgroup {
 	/* vmpressure notifications */
 	struct vmpressure vmpressure;
 
+	/* css_online() has been completed */
+	bool initialized;
+
 	/*
 	 * the counter to account for mem+swap usage.
 	 */
@@ -1090,10 +1093,23 @@ skip_node:
 	 * skipping css reference should be safe.
 	 */
 	if (next_css) {
-		if ((next_css == &root->css) ||
-		    ((next_css->flags & CSS_ONLINE) &&
-		     css_tryget_online(next_css)))
-			return mem_cgroup_from_css(next_css);
+		struct mem_cgroup *memcg = mem_cgroup_from_css(next_css);
+
+		if (next_css == &root->css)
+			return memcg;
+
+		if (css_tryget_online(next_css)) {
+			if (memcg->initialized) {
+				/*
+				 * Make sure the caller's accesses to
+				 * the memcg members are issued after
+				 * we see this flag set.
+				 */
+				smp_rmb();
+				return memcg;
+			}
+			css_put(next_css);
+		}
 
 		prev_css = next_css;
 		goto skip_node;
@@ -5413,6 +5429,7 @@ mem_cgroup_css_online(struct cgroup_subsys_state *css)
 {
 	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 	struct mem_cgroup *parent = mem_cgroup_from_css(css->parent);
+	int ret;
 
 	if (css->id > MEM_CGROUP_ID_MAX)
 		return -ENOSPC;
@@ -5449,7 +5466,16 @@ mem_cgroup_css_online(struct cgroup_subsys_state *css)
 	}
 	mutex_unlock(&memcg_create_mutex);
 
-	return memcg_init_kmem(memcg, &memory_cgrp_subsys);
+	ret = memcg_init_kmem(memcg, &memory_cgrp_subsys);
+	if (ret)
+		return ret;
+
+	/* Make sure the initialization is visible before the flag */
+	smp_wmb();
+
+	memcg->initialized = true;
+
+	return 0;
 }
 
 /*
-- 
2.1.0
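
The reference-count point behind this update can be shown with a toy model. In the first version, the root and the css_tryget_online() case shared one branch, so css_put() could run against the root even though no reference had been taken for it. The sketch below is an assumption-laden illustration of the v2 control flow only: struct toy_css, toy_tryget_online(), toy_put() and toy_visit() are hypothetical names, the refcount is a plain int, and it assumes the caller already holds its own reference on the root.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, single-threaded model of a css; not the kernel type. */
struct toy_css {
	int refcnt;
	bool online;
	bool initialized;
};

/* Take a reference only if the object is online. */
static bool toy_tryget_online(struct toy_css *css)
{
	if (!css->online)
		return false;
	css->refcnt++;
	return true;
}

/* Drop a reference that was previously taken. */
static void toy_put(struct toy_css *css)
{
	assert(css->refcnt > 0);
	css->refcnt--;
}

/*
 * v2-shaped visit: the root is returned without a get, so it is
 * never put; any other node is put only after a successful tryget.
 * Returns NULL when the node must be skipped.
 */
static struct toy_css *toy_visit(struct toy_css *root, struct toy_css *next)
{
	if (next == root)
		return next;

	if (toy_tryget_online(next)) {
		if (next->initialized)
			return next;
		toy_put(next);	/* balanced: we took this reference above */
	}
	return NULL;
}
```

In this model a skipped child ends with the same refcount it started with, and visiting the root never changes the root's refcount.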


* Re: [patch] mm: memcontrol: do not iterate uninitialized memcgs
  2014-09-25  2:31 ` Johannes Weiner
@ 2014-09-25  2:57   ` Tejun Heo
  -1 siblings, 0 replies; 23+ messages in thread
From: Tejun Heo @ 2014-09-25  2:57 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, Hugh Dickins, Michal Hocko, linux-mm, cgroups,
	linux-kernel

Hello,

On Wed, Sep 24, 2014 at 10:31:18PM -0400, Johannes Weiner wrote:
..
> not meet the ordering requirements for memcg, and so we still may see
> partially initialized memcgs from the iterators.

It's mainly the other way around - a fully initialized css may not
show up in an iteration, but given that there's no memory ordering or
synchronization around the flag, anything can happen.

...
> +		if (next_css == &root->css ||
> +		    css_tryget_online(next_css)) {
> +			struct mem_cgroup *memcg;
> +
> +			memcg = mem_cgroup_from_css(next_css);
> +			if (memcg->initialized) {
> +				/*
> +				 * Make sure the caller's accesses to
> +				 * the memcg members are issued after
> +				 * we see this flag set.

I usually prefer if the comment points to the exact location that the
matching memory barriers live.  Sometimes it's difficult to locate the
partner barrier even w/ the functional explanation.

> +				 */
> +				smp_rmb();
> +				return memcg;

In an unlikely event this rmb becomes an issue, a self-pointing
pointer which is set/read using smp_store_release() and
smp_load_acquire() respectively can do with plain barrier() on the
reader side on archs which don't need data dependency barrier
(basically everything except alpha).  Not sure whether that'd be more
or less readable than this tho.

Thanks.

-- 
tejun

* Re: [patch v2] mm: memcontrol: do not iterate uninitialized memcgs
  2014-09-25  2:40   ` Johannes Weiner
@ 2014-09-25 11:43     ` Michal Hocko
  -1 siblings, 0 replies; 23+ messages in thread
From: Michal Hocko @ 2014-09-25 11:43 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, Hugh Dickins, Tejun Heo, linux-mm, cgroups, linux-kernel

On Wed 24-09-14 22:40:55, Johannes Weiner wrote:
> Argh, buggy css_put() against the root.  Hand grenades, everywhere.
> Update:
> 
> ---
> From 9b0b4d72d71cd8acd7aaa58d2006c751decc8739 Mon Sep 17 00:00:00 2001
> From: Johannes Weiner <hannes@cmpxchg.org>
> Date: Wed, 24 Sep 2014 22:00:20 -0400
> Subject: [patch] mm: memcontrol: do not iterate uninitialized memcgs
> 
> The cgroup iterators yield css objects that have not yet gone through
> css_online(), but they are not complete memcgs at this point and so
> the memcg iterators should not return them.  d8ad30559715 ("mm/memcg:
> iteration skip memcgs not yet fully initialized") set out to implement
> exactly this, but it uses CSS_ONLINE, a cgroup-internal flag that does
> not meet the ordering requirements for memcg, and so we still may see
> partially initialized memcgs from the iterators.

I do not see how this would happen. CSS_ONLINE is set after the css_online
callback returns, and mem_cgroup_css_online ends the core initialization
with mutex_unlock, which should provide sufficient memory ordering
(kmem is not covered by that, but it is covered by activate_kmem_mutex,
and kmem.tcp by proto_list_mutex). So the worst thing that might happen
is that we miss an already initialized memcg, but that shouldn't matter
because such a memcg contains neither tasks nor memory.
memcg_has_children doesn't rely on our iterators, so the important parts
will not miss anything.

So I do not see any bug right now. The flag abuse is another story, and I
do agree we should use proper memcg-specific synchronization here, as
explained by Tejun in the other email.

> The cgroup core can not reasonably provide a clear answer on whether
> the object around the css has been fully initialized, as that depends
> on controller-specific locking and lifetime rules.  Thus, introduce a
> memcg-specific flag that is set after the memcg has been initialized
> in css_online(), and read before mem_cgroup_iter() callers access the
> memcg members.
> 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

With updated changelog
Acked-by: Michal Hocko <mhocko@suse.cz>

> Cc: <stable@vger.kernel.org>	[3.12+]

This is not necessary IMO

> ---
>  mm/memcontrol.c | 36 +++++++++++++++++++++++++++++++-----
>  1 file changed, 31 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 306b6470784c..bafdac0f724e 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -292,6 +292,9 @@ struct mem_cgroup {
>  	/* vmpressure notifications */
>  	struct vmpressure vmpressure;
>  
> +	/* css_online() has been completed */
> +	bool initialized;
> +
>  	/*
>  	 * the counter to account for mem+swap usage.
>  	 */
> @@ -1090,10 +1093,23 @@ skip_node:
>  	 * skipping css reference should be safe.
>  	 */
>  	if (next_css) {
> -		if ((next_css == &root->css) ||
> -		    ((next_css->flags & CSS_ONLINE) &&
> -		     css_tryget_online(next_css)))
> -			return mem_cgroup_from_css(next_css);
> +		struct mem_cgroup *memcg = mem_cgroup_from_css(next_css);
> +
> +		if (next_css == &root->css)
> +			return memcg;
> +
> +		if (css_tryget_online(next_css)) {
> +			if (memcg->initialized) {
> +				/*
> +				 * Make sure the caller's accesses to
> +				 * the memcg members are issued after
> +				 * we see this flag set.
> +				 */
> +				smp_rmb();
> +				return memcg;
> +			}
> +			css_put(next_css);
> +		}
>  
>  		prev_css = next_css;
>  		goto skip_node;
> @@ -5413,6 +5429,7 @@ mem_cgroup_css_online(struct cgroup_subsys_state *css)
>  {
>  	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
>  	struct mem_cgroup *parent = mem_cgroup_from_css(css->parent);
> +	int ret;
>  
>  	if (css->id > MEM_CGROUP_ID_MAX)
>  		return -ENOSPC;
> @@ -5449,7 +5466,16 @@ mem_cgroup_css_online(struct cgroup_subsys_state *css)
>  	}
>  	mutex_unlock(&memcg_create_mutex);
>  
> -	return memcg_init_kmem(memcg, &memory_cgrp_subsys);
> +	ret = memcg_init_kmem(memcg, &memory_cgrp_subsys);
> +	if (ret)
> +		return ret;
> +
> +	/* Make sure the initialization is visible before the flag */
> +	smp_wmb();
> +
> +	memcg->initialized = true;
> +
> +	return 0;
>  }
>  
>  /*
> -- 
> 2.1.0
> 

-- 
Michal Hocko
SUSE Labs

* Re: [patch] mm: memcontrol: do not iterate uninitialized memcgs
  2014-09-25  2:57   ` Tejun Heo
  (?)
@ 2014-09-25 13:43     ` Johannes Weiner
  -1 siblings, 0 replies; 23+ messages in thread
From: Johannes Weiner @ 2014-09-25 13:43 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Andrew Morton, Hugh Dickins, Michal Hocko, Peter Zijlstra,
	linux-mm, cgroups, linux-kernel

On Wed, Sep 24, 2014 at 10:57:58PM -0400, Tejun Heo wrote:
> Hello,
> 
> On Wed, Sep 24, 2014 at 10:31:18PM -0400, Johannes Weiner wrote:
> ..
> > not meet the ordering requirements for memcg, and so we still may see
> > partially initialized memcgs from the iterators.
> 
> It's mainly the other way around - a fully initialized css may not
> show up in an iteration, but given that there's no memory ordering or
> synchronization around the flag, anything can happen.

Oh sure, I'm just more worried about leaking invalid memcgs rather
than temporarily skipping over a fully initialized one.  But I updated
the changelog to mention both possibilities.

> > +		if (next_css == &root->css ||
> > +		    css_tryget_online(next_css)) {
> > +			struct mem_cgroup *memcg;
> > +
> > +			memcg = mem_cgroup_from_css(next_css);
> > +			if (memcg->initialized) {
> > +				/*
> > +				 * Make sure the caller's accesses to
> > +				 * the memcg members are issued after
> > +				 * we see this flag set.
> 
> I usually prefer if the comment points to the exact location that the
> matching memory barriers live.  Sometimes it's difficult to locate the
> partner barrier even w/ the functional explanation.

That makes sense, updated.

> > +				 */
> > +				smp_rmb();
> > +				return memcg;
> 
> In an unlikely event this rmb becomes an issue, a self-pointing
> pointer which is set/read using smp_store_release() and
> smp_load_acquire() respectively can do with plain barrier() on the
> reader side on archs which don't need data dependency barrier
> (basically everything except alpha).  Not sure whether that'd be more
> or less readable than this tho.

So as far as I understand memory-barriers.txt we do not even need a
data dependency here to use store_release and load_acquire:

mem_cgroup_css_online():
<initialize memcg>
smp_store_release(&memcg->initialized, 1);

mem_cgroup_iter():
<look up maybe-initialized memcg>
if (smp_load_acquire(&memcg->initialized))
  return memcg;

So while I doubt that the smp_rmb() will become a problem in this
path, it would be neat to annotate the state flag around which we
synchronize like this, rather than have an anonymous barrier.
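
As a rough userspace illustration of the pairing above, here is a minimal
C11 sketch.  It is not kernel code: struct object, writer(), try_read() and
publish_demo() are made-up names, and atomic_store_explicit() /
atomic_load_explicit() with memory_order_release / memory_order_acquire
stand in for smp_store_release() / smp_load_acquire() under the assumption
that they give the same one-way ordering.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-in for struct mem_cgroup; only the flag mirrors the patch. */
struct object {
	int payload;		/* stands in for the memcg members */
	atomic_int initialized;	/* published last, like memcg->initialized */
};

static struct object obj;

/* Writer side, loosely analogous to mem_cgroup_css_online(). */
static void *writer(void *arg)
{
	(void)arg;
	obj.payload = 42;	/* <initialize memcg> */
	/* Release: the payload store cannot be reordered after this store. */
	atomic_store_explicit(&obj.initialized, 1, memory_order_release);
	return NULL;
}

/* Reader side, loosely analogous to mem_cgroup_iter(); -1 means "skip". */
static int try_read(void)
{
	/* Acquire: the payload load cannot be reordered before this load. */
	if (atomic_load_explicit(&obj.initialized, memory_order_acquire))
		return obj.payload;
	return -1;
}

/* Starts the writer, polls until the flag is seen, returns the payload. */
int publish_demo(void)
{
	pthread_t t;
	int seen;

	if (pthread_create(&t, NULL, writer, NULL))
		return -2;
	while ((seen = try_read()) == -1)
		;	/* not yet published: keep polling */
	pthread_join(t, NULL);
	return seen;
}
```

Calling publish_demo() from a main() returns the payload only once the
flag has been observed.  No data dependency is involved: the reader loads
a plain member after the acquire load, which is the case being asked about.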

Peter, would you know if this is correct, or whether these primitives
actually do require a data dependency?

Thanks!

Updated patch:

---
From 1cd659f42f399adc58522d478f54587c8c4dd5cc Mon Sep 17 00:00:00 2001
From: Johannes Weiner <hannes@cmpxchg.org>
Date: Wed, 24 Sep 2014 22:00:20 -0400
Subject: [patch] mm: memcontrol: do not iterate uninitialized memcgs

The cgroup iterators yield css objects that have not yet gone through
css_online(), but they are not complete memcgs at this point and so
the memcg iterators should not return them.  d8ad30559715 ("mm/memcg:
iteration skip memcgs not yet fully initialized") set out to implement
exactly this, but it uses CSS_ONLINE, a cgroup-internal flag that does
not meet the ordering requirements for memcg, and so the iterator may
skip over initialized groups, or return partially initialized memcgs.

The cgroup core can not reasonably provide a clear answer on whether
the object around the css has been fully initialized, as that depends
on controller-specific locking and lifetime rules.  Thus, introduce a
memcg-specific flag that is set after the memcg has been initialized
in css_online(), and read before mem_cgroup_iter() callers access the
memcg members.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@vger.kernel.org>	[3.12+]
---
 mm/memcontrol.c | 36 +++++++++++++++++++++++++++++++-----
 1 file changed, 31 insertions(+), 5 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 306b6470784c..23976fd885fd 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -292,6 +292,9 @@ struct mem_cgroup {
 	/* vmpressure notifications */
 	struct vmpressure vmpressure;
 
+	/* css_online() has been completed */
+	int initialized;
+
 	/*
 	 * the counter to account for mem+swap usage.
 	 */
@@ -1090,10 +1093,21 @@ skip_node:
 	 * skipping css reference should be safe.
 	 */
 	if (next_css) {
-		if ((next_css == &root->css) ||
-		    ((next_css->flags & CSS_ONLINE) &&
-		     css_tryget_online(next_css)))
-			return mem_cgroup_from_css(next_css);
+		struct mem_cgroup *memcg = mem_cgroup_from_css(next_css);
+
+		if (next_css == &root->css)
+			return memcg;
+
+		if (css_tryget_online(next_css)) {
+			/*
+			 * Make sure the memcg is initialized:
+			 * mem_cgroup_css_online() orders the
+			 * initialization against setting the flag.
+			 */
+			if (smp_load_acquire(&memcg->initialized))
+				return memcg;
+			css_put(next_css);
+		}
 
 		prev_css = next_css;
 		goto skip_node;
@@ -5413,6 +5427,7 @@ mem_cgroup_css_online(struct cgroup_subsys_state *css)
 {
 	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 	struct mem_cgroup *parent = mem_cgroup_from_css(css->parent);
+	int ret;
 
 	if (css->id > MEM_CGROUP_ID_MAX)
 		return -ENOSPC;
@@ -5449,7 +5464,18 @@ mem_cgroup_css_online(struct cgroup_subsys_state *css)
 	}
 	mutex_unlock(&memcg_create_mutex);
 
-	return memcg_init_kmem(memcg, &memory_cgrp_subsys);
+	ret = memcg_init_kmem(memcg, &memory_cgrp_subsys);
+	if (ret)
+		return ret;
+
+	/*
+	 * Make sure the memcg is initialized: mem_cgroup_iter()
+	 * orders reading memcg->initialized against its callers
+	 * reading the memcg members.
+	 */
+	smp_store_release(&memcg->initialized, 1);
+
+	return 0;
 }
 
 /*
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [patch v2] mm: memcontrol: do not iterate uninitialized memcgs
  2014-09-25 11:43     ` Michal Hocko
  (?)
@ 2014-09-25 13:54       ` Johannes Weiner
  -1 siblings, 0 replies; 23+ messages in thread
From: Johannes Weiner @ 2014-09-25 13:54 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, Hugh Dickins, Tejun Heo, linux-mm, cgroups, linux-kernel

On Thu, Sep 25, 2014 at 01:43:39PM +0200, Michal Hocko wrote:
> On Wed 24-09-14 22:40:55, Johannes Weiner wrote:
> > Argh, buggy css_put() against the root.  Hand grenades, everywhere.
> > Update:
> > 
> > ---
> > From 9b0b4d72d71cd8acd7aaa58d2006c751decc8739 Mon Sep 17 00:00:00 2001
> > From: Johannes Weiner <hannes@cmpxchg.org>
> > Date: Wed, 24 Sep 2014 22:00:20 -0400
> > Subject: [patch] mm: memcontrol: do not iterate uninitialized memcgs
> > 
> > The cgroup iterators yield css objects that have not yet gone through
> > css_online(), but they are not complete memcgs at this point and so
> > the memcg iterators should not return them.  d8ad30559715 ("mm/memcg:
> > iteration skip memcgs not yet fully initialized") set out to implement
> > exactly this, but it uses CSS_ONLINE, a cgroup-internal flag that does
> > not meet the ordering requirements for memcg, and so we still may see
> > partially initialized memcgs from the iterators.
> 
> I do not see how would this happen. CSS_ONLINE is set after css_online
> callback returns and mem_cgroup_css_online ends the core initialization
> with mutex_unlock which should provide sufficient memory ordering
> requirements

But the iterators do not use the mutex?  We are missing the matching
acquire for the proper ordering.
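
To make the point concrete, here is a small userspace sketch of what a
matching acquire through the lock would look like.  It is only an
illustration: create_mutex, payload, online, bring_online(),
read_if_online() and locked_demo() are hypothetical names, a pthread mutex
stands in for the kernel mutex, and the flag is set inside the critical
section for simplicity.  The iterators do nothing like read_if_online(),
which is why the unlock on the writer side alone does not help them.

```c
#include <pthread.h>

/* Hypothetical stand-ins; not the kernel's memcg_create_mutex or CSS_ONLINE. */
static pthread_mutex_t create_mutex = PTHREAD_MUTEX_INITIALIZER;
static int payload;	/* stands in for the memcg members */
static int online;	/* stands in for the "fully set up" state */

/* Writer: initialize under the lock; the unlock acts as the release. */
static void *bring_online(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&create_mutex);
	payload = 42;
	online = 1;
	pthread_mutex_unlock(&create_mutex);
	return NULL;
}

/* Reader: taking the same lock is the matching acquire; -1 means "skip". */
static int read_if_online(void)
{
	int ret = -1;

	pthread_mutex_lock(&create_mutex);
	if (online)
		ret = payload;
	pthread_mutex_unlock(&create_mutex);
	return ret;
}

/* Starts the writer, polls until the object is visible, returns the payload. */
int locked_demo(void)
{
	pthread_t t;
	int seen;

	if (pthread_create(&t, NULL, bring_online, NULL))
		return -2;
	while ((seen = read_if_online()) == -1)
		;	/* not online yet: retry */
	pthread_join(t, NULL);
	return seen;
}
```

A reader that checks the flag without taking create_mutex gets none of
this ordering, regardless of what the writer's unlock guarantees.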

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch v2] mm: memcontrol: do not iterate uninitialized memcgs
  2014-09-25 13:54       ` Johannes Weiner
  (?)
@ 2014-09-25 14:11         ` Michal Hocko
  -1 siblings, 0 replies; 23+ messages in thread
From: Michal Hocko @ 2014-09-25 14:11 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, Hugh Dickins, Tejun Heo, linux-mm, cgroups, linux-kernel

On Thu 25-09-14 09:54:50, Johannes Weiner wrote:
> On Thu, Sep 25, 2014 at 01:43:39PM +0200, Michal Hocko wrote:
> > On Wed 24-09-14 22:40:55, Johannes Weiner wrote:
> > > Argh, buggy css_put() against the root.  Hand grenades, everywhere.
> > > Update:
> > > 
> > > ---
> > > From 9b0b4d72d71cd8acd7aaa58d2006c751decc8739 Mon Sep 17 00:00:00 2001
> > > From: Johannes Weiner <hannes@cmpxchg.org>
> > > Date: Wed, 24 Sep 2014 22:00:20 -0400
> > > Subject: [patch] mm: memcontrol: do not iterate uninitialized memcgs
> > > 
> > > The cgroup iterators yield css objects that have not yet gone through
> > > css_online(), but they are not complete memcgs at this point and so
> > > the memcg iterators should not return them.  d8ad30559715 ("mm/memcg:
> > > iteration skip memcgs not yet fully initialized") set out to implement
> > > exactly this, but it uses CSS_ONLINE, a cgroup-internal flag that does
> > > not meet the ordering requirements for memcg, and so we still may see
> > > partially initialized memcgs from the iterators.
> > 
> > I do not see how would this happen. CSS_ONLINE is set after css_online
> > callback returns and mem_cgroup_css_online ends the core initialization
> > with mutex_unlock which should provide sufficient memory ordering
> > requirements
> 
> But the iterators do not use the mutex?  We are missing the matching
> acquire for the proper ordering.

OK, I guess you are right. Besides that, I am not sure what the ordering
guarantees of a mutex are, now that I am looking into the code.

Anyway it is definitely better to be explicit about barriers.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch] mm: memcontrol: do not iterate uninitialized memcgs
  2014-09-25 13:43     ` Johannes Weiner
@ 2014-09-25 14:23       ` Michal Hocko
  -1 siblings, 0 replies; 23+ messages in thread
From: Michal Hocko @ 2014-09-25 14:23 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Tejun Heo, Andrew Morton, Hugh Dickins, Peter Zijlstra, linux-mm,
	cgroups, linux-kernel

On Thu 25-09-14 09:43:42, Johannes Weiner wrote:
[...]
> From 1cd659f42f399adc58522d478f54587c8c4dd5cc Mon Sep 17 00:00:00 2001
> From: Johannes Weiner <hannes@cmpxchg.org>
> Date: Wed, 24 Sep 2014 22:00:20 -0400
> Subject: [patch] mm: memcontrol: do not iterate uninitialized memcgs
> 
> The cgroup iterators yield css objects that have not yet gone through
> css_online(), but they are not complete memcgs at this point and so
> the memcg iterators should not return them.  d8ad30559715 ("mm/memcg:
> iteration skip memcgs not yet fully initialized") set out to implement
> exactly this, but it uses CSS_ONLINE, a cgroup-internal flag that does
> not meet the ordering requirements for memcg, and so the iterator may
> skip over initialized groups, or return partially initialized memcgs.
> 
> The cgroup core can not reasonably provide a clear answer on whether
> the object around the css has been fully initialized, as that depends
> on controller-specific locking and lifetime rules.  Thus, introduce a
> memcg-specific flag that is set after the memcg has been initialized
> in css_online(), and read before mem_cgroup_iter() callers access the
> memcg members.
> 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> Cc: <stable@vger.kernel.org>	[3.12+]

I am not an expert (obviously) on memory barriers but from
Documentation/memory-barriers.txt, my understanding is that
smp_load_acquire and smp_store_release are exactly what we need here.
"
However, after an ACQUIRE on a given variable, all memory accesses
preceding any prior RELEASE on that same variable are guaranteed to be
visible.
"

Acked-by: Michal Hocko <mhocko@suse.cz>

A stable backport would be trickier because ACQUIRE/RELEASE were
introduced later, but smp_mb() should be a safe replacement.
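
A rough userspace sketch of that fallback, assuming a full fence on each
side is an acceptable (if heavier) substitute for the release/acquire
pair.  This is not the backport itself: writer_mb(), reader_mb() and
fence_demo() are hypothetical names, and
atomic_thread_fence(memory_order_seq_cst) merely plays the role of
smp_mb() here.

```c
#include <pthread.h>
#include <stdatomic.h>

static int payload;		/* stands in for the memcg members */
static atomic_int initialized;	/* stands in for memcg->initialized */

/* Writer: full fence between the initialization and the flag store. */
static void *writer_mb(void *arg)
{
	(void)arg;
	payload = 42;
	atomic_thread_fence(memory_order_seq_cst);	/* role of smp_mb() */
	atomic_store_explicit(&initialized, 1, memory_order_relaxed);
	return NULL;
}

/* Reader: full fence between the flag load and the member load. */
static int reader_mb(void)
{
	if (!atomic_load_explicit(&initialized, memory_order_relaxed))
		return -1;	/* not initialized: skip */
	atomic_thread_fence(memory_order_seq_cst);	/* role of smp_mb() */
	return payload;
}

/* Starts the writer, polls until the flag is seen, returns the payload. */
int fence_demo(void)
{
	pthread_t t;
	int seen;

	if (pthread_create(&t, NULL, writer_mb, NULL))
		return -2;
	while ((seen = reader_mb()) == -1)
		;	/* flag not visible yet: retry */
	pthread_join(t, NULL);
	return seen;
}
```

The fences are stronger than strictly needed, which is what makes them a
safe stand-in where the acquire/release primitives are not available.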

Thanks!

> ---
>  mm/memcontrol.c | 36 +++++++++++++++++++++++++++++++-----
>  1 file changed, 31 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 306b6470784c..23976fd885fd 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -292,6 +292,9 @@ struct mem_cgroup {
>  	/* vmpressure notifications */
>  	struct vmpressure vmpressure;
>  
> +	/* css_online() has been completed */
> +	int initialized;
> +
>  	/*
>  	 * the counter to account for mem+swap usage.
>  	 */
> @@ -1090,10 +1093,21 @@ skip_node:
>  	 * skipping css reference should be safe.
>  	 */
>  	if (next_css) {
> -		if ((next_css == &root->css) ||
> -		    ((next_css->flags & CSS_ONLINE) &&
> -		     css_tryget_online(next_css)))
> -			return mem_cgroup_from_css(next_css);
> +		struct mem_cgroup *memcg = mem_cgroup_from_css(next_css);
> +
> +		if (next_css == &root->css)
> +			return memcg;
> +
> +		if (css_tryget_online(next_css)) {
> +			/*
> +			 * Make sure the memcg is initialized:
> +			 * mem_cgroup_css_online() orders the
> +			 * initialization against setting the flag.
> +			 */
> +			if (smp_load_acquire(&memcg->initialized))
> +				return memcg;
> +			css_put(next_css);
> +		}
>  
>  		prev_css = next_css;
>  		goto skip_node;
> @@ -5413,6 +5427,7 @@ mem_cgroup_css_online(struct cgroup_subsys_state *css)
>  {
>  	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
>  	struct mem_cgroup *parent = mem_cgroup_from_css(css->parent);
> +	int ret;
>  
>  	if (css->id > MEM_CGROUP_ID_MAX)
>  		return -ENOSPC;
> @@ -5449,7 +5464,18 @@ mem_cgroup_css_online(struct cgroup_subsys_state *css)
>  	}
>  	mutex_unlock(&memcg_create_mutex);
>  
> -	return memcg_init_kmem(memcg, &memory_cgrp_subsys);
> +	ret = memcg_init_kmem(memcg, &memory_cgrp_subsys);
> +	if (ret)
> +		return ret;
> +
> +	/*
> +	 * Make sure the memcg is initialized: mem_cgroup_iter()
> +	 * orders reading memcg->initialized against its callers
> +	 * reading the memcg members.
> +	 */
> +	smp_store_release(&memcg->initialized, 1);
> +
> +	return 0;
>  }
>  
>  /*
> -- 
> 2.1.0
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch] mm: memcontrol: do not iterate uninitialized memcgs
@ 2014-09-25 14:23       ` Michal Hocko
  0 siblings, 0 replies; 23+ messages in thread
From: Michal Hocko @ 2014-09-25 14:23 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Tejun Heo, Andrew Morton, Hugh Dickins, Peter Zijlstra, linux-mm,
	cgroups, linux-kernel

On Thu 25-09-14 09:43:42, Johannes Weiner wrote:
[...]
> From 1cd659f42f399adc58522d478f54587c8c4dd5cc Mon Sep 17 00:00:00 2001
> From: Johannes Weiner <hannes@cmpxchg.org>
> Date: Wed, 24 Sep 2014 22:00:20 -0400
> Subject: [patch] mm: memcontrol: do not iterate uninitialized memcgs
> 
> The cgroup iterators yield css objects that have not yet gone through
> css_online(), but they are not complete memcgs at this point and so
> the memcg iterators should not return them.  d8ad30559715 ("mm/memcg:
> iteration skip memcgs not yet fully initialized") set out to implement
> exactly this, but it uses CSS_ONLINE, a cgroup-internal flag that does
> not meet the ordering requirements for memcg, and so the iterator may
> skip over initialized groups, or return partially initialized memcgs.
> 
> The cgroup core can not reasonably provide a clear answer on whether
> the object around the css has been fully initialized, as that depends
> on controller-specific locking and lifetime rules.  Thus, introduce a
> memcg-specific flag that is set after the memcg has been initialized
> in css_online(), and read before mem_cgroup_iter() callers access the
> memcg members.
> 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> Cc: <stable@vger.kernel.org>	[3.12+]

I am not an expert (obviously) on memory barriers but from
Documentation/memory-barriers.txt, my understanding is that
smp_load_acquire and smp_store_release is exactly what we need here.
"
However, after an ACQUIRE on a given variable, all memory accesses
preceding any prior RELEASE on that same variable are guaranteed to be
visible.
"

Acked-by: Michal Hocko <mhocko@suse.cz>

Stable backport would be trickier because ACQUIRE/RELEASE were
introduced later but smp_mb() should be safe replacement.

Thanks!

> ---
>  mm/memcontrol.c | 36 +++++++++++++++++++++++++++++++-----
>  1 file changed, 31 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 306b6470784c..23976fd885fd 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -292,6 +292,9 @@ struct mem_cgroup {
>  	/* vmpressure notifications */
>  	struct vmpressure vmpressure;
>  
> +	/* css_online() has been completed */
> +	int initialized;
> +
>  	/*
>  	 * the counter to account for mem+swap usage.
>  	 */
> @@ -1090,10 +1093,21 @@ skip_node:
>  	 * skipping css reference should be safe.
>  	 */
>  	if (next_css) {
> -		if ((next_css == &root->css) ||
> -		    ((next_css->flags & CSS_ONLINE) &&
> -		     css_tryget_online(next_css)))
> -			return mem_cgroup_from_css(next_css);
> +		struct mem_cgroup *memcg = mem_cgroup_from_css(next_css);
> +
> +		if (next_css == &root->css)
> +			return memcg;
> +
> +		if (css_tryget_online(next_css)) {
> +			/*
> +			 * Make sure the memcg is initialized:
> +			 * mem_cgroup_css_online() orders the
> +			 * initialization against setting the flag.
> +			 */
> +			if (smp_load_acquire(&memcg->initialized))
> +				return memcg;
> +			css_put(next_css);
> +		}
>  
>  		prev_css = next_css;
>  		goto skip_node;
> @@ -5413,6 +5427,7 @@ mem_cgroup_css_online(struct cgroup_subsys_state *css)
>  {
>  	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
>  	struct mem_cgroup *parent = mem_cgroup_from_css(css->parent);
> +	int ret;
>  
>  	if (css->id > MEM_CGROUP_ID_MAX)
>  		return -ENOSPC;
> @@ -5449,7 +5464,18 @@ mem_cgroup_css_online(struct cgroup_subsys_state *css)
>  	}
>  	mutex_unlock(&memcg_create_mutex);
>  
> -	return memcg_init_kmem(memcg, &memory_cgrp_subsys);
> +	ret = memcg_init_kmem(memcg, &memory_cgrp_subsys);
> +	if (ret)
> +		return ret;
> +
> +	/*
> +	 * Make sure the memcg is initialized: mem_cgroup_iter()
> +	 * orders reading memcg->initialized against its callers
> +	 * reading the memcg members.
> +	 */
> +	smp_store_release(&memcg->initialized, 1);
> +
> +	return 0;
>  }
>  
>  /*
> -- 
> 2.1.0
> 

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch] mm: memcontrol: do not iterate uninitialized memcgs
  2014-09-25 13:43     ` Johannes Weiner
@ 2014-09-26 13:39       ` Peter Zijlstra
  -1 siblings, 0 replies; 23+ messages in thread
From: Peter Zijlstra @ 2014-09-26 13:39 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Tejun Heo, Andrew Morton, Hugh Dickins, Michal Hocko, linux-mm,
	cgroups, linux-kernel

On Thu, Sep 25, 2014 at 09:43:42AM -0400, Johannes Weiner wrote:
> > > +		if (next_css == &root->css ||
> > > +		    css_tryget_online(next_css)) {
> > > +			struct mem_cgroup *memcg;
> > > +
> > > +			memcg = mem_cgroup_from_css(next_css);
> > > +			if (memcg->initialized) {
> > > +				/*
> > > +				 * Make sure the caller's accesses to
> > > +				 * the memcg members are issued after
> > > +				 * we see this flag set.
> > 
> > I usually prefer if the comment points to the exact location that the
> > matching memory barriers live.  Sometimes it's difficult to locate the
> > partner barrier even w/ the functional explanation.

That is indeed good practise! :-)

> > > +				 */
> > > +				smp_rmb();
> > > +				return memcg;
> > 
> > In an unlikely event this rmb becomes an issue, a self-pointing
> > pointer which is set/read using smp_store_release() and
> > smp_load_acquire() respectively can do with plain barrier() on the
> > reader side on archs which don't need data dependency barrier
> > (basically everything except alpha).  Not sure whether that'd be more
> > or less readable than this tho.

> So as far as I understand memory-barriers.txt we do not even need a
> data dependency here to use store_release and load_acquire:
> 
> mem_cgroup_css_online():
> <initialize memcg>
> smp_store_release(&memcg->initialized, 1);
> 
> mem_cgroup_iter():
> <look up maybe-initialized memcg>
> if (smp_load_acquire(&memcg->initialized))
>   return memcg;
> 
> So while I doubt that the smp_rmb() will become a problem in this
> path, it would be neat to annotate the state flag around which we
> synchronize like this, rather than have an anonymous barrier.
> 
> Peter, would you know if this is correct, or whether these primitives
> actually do require a data dependency?

I'm fairly sure you do not. load_acquire() has the same barrier on
Alpha that read_barrier_depends() does, and that's the only arch that
matters.
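
The store_release/load_acquire pairing discussed above can be sketched
in user space with C11 atomics. This is an illustration under stated
assumptions, not the kernel implementation: struct group and the
group_*() helpers are hypothetical, and memory_order_release /
memory_order_acquire are assumed here as rough analogues of
smp_store_release() / smp_load_acquire(). Note that the reader checks a
plain flag, with no pointer (data) dependency involved.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mem_cgroup; not the kernel type. */
struct group {
	long limit;             /* ordinary member set up during init */
	atomic_int initialized; /* publication flag */
};

static void group_init(struct group *g)
{
	g->limit = 0;
	atomic_init(&g->initialized, 0);
}

/* Writer side, playing the role of mem_cgroup_css_online(). */
static void group_online(struct group *g, long limit)
{
	/* Onlining is expected to happen only once per object. */
	assert(!atomic_load_explicit(&g->initialized, memory_order_relaxed));

	g->limit = limit; /* <initialize members> */
	/* Release: the member stores above are ordered before the flag. */
	atomic_store_explicit(&g->initialized, 1, memory_order_release);
}

/* Reader side, playing the role of mem_cgroup_iter(). */
static struct group *group_get(struct group *g)
{
	/* Acquire: later member reads are ordered after seeing the flag. */
	if (atomic_load_explicit(&g->initialized, memory_order_acquire))
		return g;
	return NULL; /* not yet initialized, caller should skip it */
}
```

A caller that gets a non-NULL pointer back from group_get() can read
g->limit and will observe the value stored before the flag was set.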


^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2014-09-26 13:39 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
2014-09-25  2:31 [patch] mm: memcontrol: do not iterate uninitialized memcgs Johannes Weiner
2014-09-25  2:31 ` Johannes Weiner
2014-09-25  2:31 ` Johannes Weiner
2014-09-25  2:40 ` [patch v2] " Johannes Weiner
2014-09-25  2:40   ` Johannes Weiner
2014-09-25  2:40   ` Johannes Weiner
2014-09-25 11:43   ` Michal Hocko
2014-09-25 11:43     ` Michal Hocko
2014-09-25 13:54     ` Johannes Weiner
2014-09-25 13:54       ` Johannes Weiner
2014-09-25 13:54       ` Johannes Weiner
2014-09-25 14:11       ` Michal Hocko
2014-09-25 14:11         ` Michal Hocko
2014-09-25 14:11         ` Michal Hocko
2014-09-25  2:57 ` [patch] " Tejun Heo
2014-09-25  2:57   ` Tejun Heo
2014-09-25 13:43   ` Johannes Weiner
2014-09-25 13:43     ` Johannes Weiner
2014-09-25 13:43     ` Johannes Weiner
2014-09-25 14:23     ` Michal Hocko
2014-09-25 14:23       ` Michal Hocko
2014-09-26 13:39     ` Peter Zijlstra
2014-09-26 13:39       ` Peter Zijlstra
