linux-kernel.vger.kernel.org archive mirror
* [PATCH] memcg: restore ss->id_lock to spinlock, using RCU for next
@ 2012-01-19  6:05 Hugh Dickins
  2012-01-19  6:58 ` KAMEZAWA Hiroyuki
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Hugh Dickins @ 2012-01-19  6:05 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Li Zefan, Andrew Morton, Manfred Spraul, KAMEZAWA Hiroyuki,
	Johannes Weiner, Ying Han, Greg Thelen, cgroups, linux-mm,
	linux-kernel

Commit c1e2ee2dc436 "memcg: replace ss->id_lock with a rwlock" has
now been seen to cause the unfair behavior we should have expected
from converting a spinlock to an rwlock: softlockup in cgroup_mkdir(),
whose get_new_cssid() is waiting for the wlock, while there are 19
tasks using the rlock in css_get_next() to get on with their memcg
workload (in an artificial test, admittedly).  Yet lib/idr.c was
made suitable for RCU way back.

1. Revert that commit, restoring ss->id_lock to a spinlock.

2. Make one small adjustment to idr_get_next(): take the height from
the top layer (stable under RCU) instead of from the root (unprotected
by RCU), as idr_find() does.

3. Remove lock and unlock around css_get_next()'s call to idr_get_next():
memcg iterators (only users of css_get_next) already did rcu_read_lock(),
and its comment demands that; but add a WARN_ON_ONCE to make sure of it.

Signed-off-by: Hugh Dickins <hughd@google.com>
---

 include/linux/cgroup.h |    2 +-
 kernel/cgroup.c        |   19 +++++++++----------
 lib/idr.c              |    4 ++--
 3 files changed, 12 insertions(+), 13 deletions(-)

--- 3.2.0+/include/linux/cgroup.h	2012-01-14 13:01:57.532007832 -0800
+++ linux/include/linux/cgroup.h	2012-01-18 21:21:45.695966602 -0800
@@ -535,7 +535,7 @@ struct cgroup_subsys {
 	struct list_head sibling;
 	/* used when use_id == true */
 	struct idr idr;
-	rwlock_t id_lock;
+	spinlock_t id_lock;
 
 	/* should be defined only by modular subsystems */
 	struct module *module;
--- 3.2.0+/kernel/cgroup.c	2012-01-14 13:01:57.824007839 -0800
+++ linux/kernel/cgroup.c	2012-01-18 21:29:05.199958492 -0800
@@ -4939,9 +4939,9 @@ void free_css_id(struct cgroup_subsys *s
 
 	rcu_assign_pointer(id->css, NULL);
 	rcu_assign_pointer(css->id, NULL);
-	write_lock(&ss->id_lock);
+	spin_lock(&ss->id_lock);
 	idr_remove(&ss->idr, id->id);
-	write_unlock(&ss->id_lock);
+	spin_unlock(&ss->id_lock);
 	kfree_rcu(id, rcu_head);
 }
 EXPORT_SYMBOL_GPL(free_css_id);
@@ -4967,10 +4967,10 @@ static struct css_id *get_new_cssid(stru
 		error = -ENOMEM;
 		goto err_out;
 	}
-	write_lock(&ss->id_lock);
+	spin_lock(&ss->id_lock);
 	/* Don't use 0. allocates an ID of 1-65535 */
 	error = idr_get_new_above(&ss->idr, newid, 1, &myid);
-	write_unlock(&ss->id_lock);
+	spin_unlock(&ss->id_lock);
 
 	/* Returns error when there are no free spaces for new ID.*/
 	if (error) {
@@ -4985,9 +4985,9 @@ static struct css_id *get_new_cssid(stru
 	return newid;
 remove_idr:
 	error = -ENOSPC;
-	write_lock(&ss->id_lock);
+	spin_lock(&ss->id_lock);
 	idr_remove(&ss->idr, myid);
-	write_unlock(&ss->id_lock);
+	spin_unlock(&ss->id_lock);
 err_out:
 	kfree(newid);
 	return ERR_PTR(error);
@@ -4999,7 +4999,7 @@ static int __init_or_module cgroup_init_
 {
 	struct css_id *newid;
 
-	rwlock_init(&ss->id_lock);
+	spin_lock_init(&ss->id_lock);
 	idr_init(&ss->idr);
 
 	newid = get_new_cssid(ss, 0);
@@ -5087,6 +5087,8 @@ css_get_next(struct cgroup_subsys *ss, i
 		return NULL;
 
 	BUG_ON(!ss->use_id);
+	WARN_ON_ONCE(!rcu_read_lock_held());
+
 	/* fill start point for scan */
 	tmpid = id;
 	while (1) {
@@ -5094,10 +5096,7 @@ css_get_next(struct cgroup_subsys *ss, i
 		 * scan next entry from bitmap(tree), tmpid is updated after
 		 * idr_get_next().
 		 */
-		read_lock(&ss->id_lock);
 		tmp = idr_get_next(&ss->idr, &tmpid);
-		read_unlock(&ss->id_lock);
-
 		if (!tmp)
 			break;
 		if (tmp->depth >= depth && tmp->stack[depth] == rootid) {
--- 3.2.0+/lib/idr.c	2012-01-04 15:55:44.000000000 -0800
+++ linux/lib/idr.c	2012-01-18 21:25:36.947963342 -0800
@@ -605,11 +605,11 @@ void *idr_get_next(struct idr *idp, int
 	int n, max;
 
 	/* find first ent */
-	n = idp->layers * IDR_BITS;
-	max = 1 << n;
 	p = rcu_dereference_raw(idp->top);
 	if (!p)
 		return NULL;
+	n = (p->layer + 1) * IDR_BITS;
+	max = 1 << n;
 
 	while (id < max) {
 		while (n > 0 && p) {

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] memcg: restore ss->id_lock to spinlock, using RCU for next
  2012-01-19  6:05 [PATCH] memcg: restore ss->id_lock to spinlock, using RCU for next Hugh Dickins
@ 2012-01-19  6:58 ` KAMEZAWA Hiroyuki
  2012-01-19  7:31 ` Li Zefan
  2012-01-19  7:33 ` Eric Dumazet
  2 siblings, 0 replies; 14+ messages in thread
From: KAMEZAWA Hiroyuki @ 2012-01-19  6:58 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Tejun Heo, Li Zefan, Andrew Morton, Manfred Spraul,
	Johannes Weiner, Ying Han, Greg Thelen, cgroups, linux-mm,
	linux-kernel

On Wed, 18 Jan 2012 22:05:12 -0800 (PST)
Hugh Dickins <hughd@google.com> wrote:

> Commit c1e2ee2dc436 "memcg: replace ss->id_lock with a rwlock" has
> now been seen to cause the unfair behavior we should have expected
> from converting a spinlock to an rwlock: softlockup in cgroup_mkdir(),
> whose get_new_cssid() is waiting for the wlock, while there are 19
> tasks using the rlock in css_get_next() to get on with their memcg
> workload (in an artificial test, admittedly).  Yet lib/idr.c was
> made suitable for RCU way back.
> 
> 1. Revert that commit, restoring ss->id_lock to a spinlock.
> 
> 2. Make one small adjustment to idr_get_next(): take the height from
> the top layer (stable under RCU) instead of from the root (unprotected
> by RCU), as idr_find() does.
> 
> 3. Remove lock and unlock around css_get_next()'s call to idr_get_next():
> memcg iterators (only users of css_get_next) already did rcu_read_lock(),
> and comment demands that, but add a WARN_ON_ONCE to make sure of it.
> 
> Signed-off-by: Hugh Dickins <hughd@google.com>

Thank you ! This seems much better.

Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>




* Re: [PATCH] memcg: restore ss->id_lock to spinlock, using RCU for next
  2012-01-19  6:05 [PATCH] memcg: restore ss->id_lock to spinlock, using RCU for next Hugh Dickins
  2012-01-19  6:58 ` KAMEZAWA Hiroyuki
@ 2012-01-19  7:31 ` Li Zefan
  2012-01-19  7:33 ` Eric Dumazet
  2 siblings, 0 replies; 14+ messages in thread
From: Li Zefan @ 2012-01-19  7:31 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Tejun Heo, Andrew Morton, Manfred Spraul, KAMEZAWA Hiroyuki,
	Johannes Weiner, Ying Han, Greg Thelen, cgroups, linux-mm,
	linux-kernel

Hugh Dickins wrote:
> Commit c1e2ee2dc436 "memcg: replace ss->id_lock with a rwlock" has
> now been seen to cause the unfair behavior we should have expected
> from converting a spinlock to an rwlock: softlockup in cgroup_mkdir(),
> whose get_new_cssid() is waiting for the wlock, while there are 19
> tasks using the rlock in css_get_next() to get on with their memcg
> workload (in an artificial test, admittedly).  Yet lib/idr.c was
> made suitable for RCU way back.
> 
> 1. Revert that commit, restoring ss->id_lock to a spinlock.
> 
> 2. Make one small adjustment to idr_get_next(): take the height from
> the top layer (stable under RCU) instead of from the root (unprotected
> by RCU), as idr_find() does.
> 
> 3. Remove lock and unlock around css_get_next()'s call to idr_get_next():
> memcg iterators (only users of css_get_next) already did rcu_read_lock(),
> and comment demands that, but add a WARN_ON_ONCE to make sure of it.
> 
> Signed-off-by: Hugh Dickins <hughd@google.com>

Acked-by: Li Zefan <lizf@cn.fujitsu.com>

> ---
> 
>  include/linux/cgroup.h |    2 +-
>  kernel/cgroup.c        |   19 +++++++++----------
>  lib/idr.c              |    4 ++--
>  3 files changed, 12 insertions(+), 13 deletions(-)


* Re: [PATCH] memcg: restore ss->id_lock to spinlock, using RCU for next
  2012-01-19  6:05 [PATCH] memcg: restore ss->id_lock to spinlock, using RCU for next Hugh Dickins
  2012-01-19  6:58 ` KAMEZAWA Hiroyuki
  2012-01-19  7:31 ` Li Zefan
@ 2012-01-19  7:33 ` Eric Dumazet
  2012-01-19 12:28   ` Tejun Heo
  2 siblings, 1 reply; 14+ messages in thread
From: Eric Dumazet @ 2012-01-19  7:33 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Tejun Heo, Li Zefan, Andrew Morton, Manfred Spraul,
	KAMEZAWA Hiroyuki, Johannes Weiner, Ying Han, Greg Thelen,
	cgroups, linux-mm, linux-kernel

On Wednesday, 18 January 2012 at 22:05 -0800, Hugh Dickins wrote:

> 2. Make one small adjustment to idr_get_next(): take the height from
> the top layer (stable under RCU) instead of from the root (unprotected
> by RCU), as idr_find() does.
> 

> --- 3.2.0+/lib/idr.c	2012-01-04 15:55:44.000000000 -0800
> +++ linux/lib/idr.c	2012-01-18 21:25:36.947963342 -0800
> @@ -605,11 +605,11 @@ void *idr_get_next(struct idr *idp, int
>  	int n, max;
>  
>  	/* find first ent */
> -	n = idp->layers * IDR_BITS;
> -	max = 1 << n;
>  	p = rcu_dereference_raw(idp->top);
>  	if (!p)
>  		return NULL;
> +	n = (p->layer + 1) * IDR_BITS;
> +	max = 1 << n;
>  
>  	while (id < max) {
>  		while (n > 0 && p) {

Interesting, but should be a patch on its own.

Maybe other idr users can benefit from your idea as well, if the patch is
labeled "idr: allow idr_get_next() from rcu_read_lock" or something...

I suggest introducing an idr_get_next_rcu() helper to make the RCU check
cleaner.

idr_get_next_rcu(...)
{
	WARN_ON_ONCE(!rcu_read_lock_held());
	return idr_get_next(...);
}





* Re: [PATCH] memcg: restore ss->id_lock to spinlock, using RCU for next
  2012-01-19  7:33 ` Eric Dumazet
@ 2012-01-19 12:28   ` Tejun Heo
  2012-01-19 13:30     ` Eric Dumazet
  0 siblings, 1 reply; 14+ messages in thread
From: Tejun Heo @ 2012-01-19 12:28 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Hugh Dickins, Li Zefan, Andrew Morton, Manfred Spraul,
	KAMEZAWA Hiroyuki, Johannes Weiner, Ying Han, Greg Thelen,
	cgroups, linux-mm, linux-kernel

Hello,

On Wed, Jan 18, 2012 at 11:33 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Interesting, but should be a patch on its own.

Yeap, agreed.

> Maybe other idr users can benefit from your idea as well, if patch is
> labeled  "idr: allow idr_get_next() from rcu_read_lock" or something...
>
> I suggest introducing idr_get_next_rcu() helper to make the check about
> rcu cleaner.
>
> idr_get_next_rcu(...)
> {
>        WARN_ON_ONCE(!rcu_read_lock_held());
>        return idr_get_next(...);
> }

Hmmm... I don't know. Does having a separate set of interfaces help
much?  It's easy to avoid/miss the test by using the other one.  If we
really worry about it, maybe indicating which locking is to be used
during init is better? We can remember the lockdep map and trigger
WARN_ON_ONCE() if neither the lock nor the RCU read lock is held.
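
A hypothetical userspace sketch of that init-time idea (every name here is
invented for illustration; none of this is existing kernel API):

```c
#include <assert.h>
#include <stddef.h>

/* Record at init time how the structure is protected, then check that
 * condition on every lookup -- an assert() stands in for WARN_ON_ONCE()
 * and the real lockdep machinery.  All names are illustrative. */
struct checked_idr {
	void *top;			/* stand-in for the real radix tree */
	int (*held)(void *ctx);		/* nonzero when the chosen protection is held */
	void *ctx;
};

static void checked_idr_init(struct checked_idr *idr,
			     int (*held)(void *ctx), void *ctx)
{
	idr->top = NULL;
	idr->held = held;
	idr->ctx = ctx;
}

static void *checked_idr_find(struct checked_idr *idr)
{
	assert(idr->held(idr->ctx));	/* WARN_ON_ONCE() analog */
	return idr->top;
}
```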

Thanks.

-- 
tejun


* Re: [PATCH] memcg: restore ss->id_lock to spinlock, using RCU for next
  2012-01-19 12:28   ` Tejun Heo
@ 2012-01-19 13:30     ` Eric Dumazet
  2012-01-19 20:46       ` Hugh Dickins
  0 siblings, 1 reply; 14+ messages in thread
From: Eric Dumazet @ 2012-01-19 13:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Hugh Dickins, Li Zefan, Andrew Morton, Manfred Spraul,
	KAMEZAWA Hiroyuki, Johannes Weiner, Ying Han, Greg Thelen,
	cgroups, linux-mm, linux-kernel

On Thursday, 19 January 2012 at 04:28 -0800, Tejun Heo wrote:
> Hello,
> 
> On Wed, Jan 18, 2012 at 11:33 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > Interesting, but should be a patch on its own.
> 
> Yeap, agreed.
> 
> > Maybe other idr users can benefit from your idea as well, if patch is
> > labeled  "idr: allow idr_get_next() from rcu_read_lock" or something...
> >
> > I suggest introducing idr_get_next_rcu() helper to make the check about
> > rcu cleaner.
> >
> > idr_get_next_rcu(...)
> > {
> >        WARN_ON_ONCE(!rcu_read_lock_held());
> >        return idr_get_next(...);
> > }
> 
> Hmmm... I don't know. Does having a separate set of interfaces help
> much?  It's easy to avoid/miss the test by using the other one.  If we
> really worry about it, maybe indicating which locking is to be used
> during init is better? We can remember the lockdep map and trigger
> WARN_ON_ONCE() if neither the lock nor the RCU read lock is held.


There is an rcu_dereference_raw(ptr) in idr_get_next()

This could be changed to rcu_dereference_check(ptr, condition) to get
lockdep support for free :)

[ condition would be the appropriate
lockdep_is_held(&the_lock_protecting_my_idr) or 'I use the rcu variant'
and I hold rcu_read_lock ]

This would need to add a 'condition' parameter to idr_get_next(), but we
have very few users in kernel at this moment.
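
A userspace illustration of that shape (deref_check() and get_next_check()
are invented names, not the real rcu_dereference_check() or idr_get_next()):

```c
#include <assert.h>
#include <stddef.h>

/* The dereference takes a caller-supplied condition, so a lockdep-style
 * check rides along with the lookup.  Purely illustrative. */
#define deref_check(p, cond)	(assert(cond), (p))

struct item {
	int id;
};

/* idr_get_next()-like scan over a flat slot array, with the extra check */
static struct item *get_next_check(struct item **slots, int nslots,
				   int *nextidp, int cond)
{
	struct item **top = deref_check(slots, cond);
	int id;

	for (id = *nextidp; id < nslots; id++) {
		if (top[id]) {
			*nextidp = id;
			return top[id];
		}
	}
	return NULL;
}
```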





* Re: [PATCH] memcg: restore ss->id_lock to spinlock, using RCU for next
  2012-01-19 13:30     ` Eric Dumazet
@ 2012-01-19 20:46       ` Hugh Dickins
  2012-01-19 20:48         ` [PATCH 1/3] idr: make idr_get_next() good for rcu_read_lock() Hugh Dickins
                           ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Hugh Dickins @ 2012-01-19 20:46 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Eric Dumazet, Li Zefan, Andrew Morton, Manfred Spraul,
	KAMEZAWA Hiroyuki, Johannes Weiner, Ying Han, Greg Thelen,
	cgroups, linux-mm, linux-kernel


On Thu, 19 Jan 2012, Eric Dumazet wrote:
> On Thursday, 19 January 2012 at 04:28 -0800, Tejun Heo wrote:
> > Hello,
> > 
> > On Wed, Jan 18, 2012 at 11:33 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > > Interesting, but should be a patch on its own.
> > 
> > Yeap, agreed.

Okay, in that case I'd better split into three (idr, revert, remove lock).
I'll send those three in a moment.  I've also slipped an RCU comment from
idr_find into idr_get_next, and put the Acks in all three.

> > 
> > > Maybe other idr users can benefit from your idea as well, if patch is
> > > labeled  "idr: allow idr_get_next() from rcu_read_lock" or something...
> > >
> > > I suggest introducing idr_get_next_rcu() helper to make the check about
> > > rcu cleaner.
> > >
> > > idr_get_next_rcu(...)
> > > {
> > >        WARN_ON_ONCE(!rcu_read_lock_held());
> > >        return idr_get_next(...);
> > > }
> > 
> > Hmmm... I don't know. Does having a separate set of interface help
> > much?  It's easy to avoid/miss the test by using the other one.  If we
> > really worry about it, maybe indicating which locking is to be used
> > during init is better? We can remember the lockdep map and trigger
> > WARN_ON_ONCE() if neither the lock or RCU read lock is held.
> 
> 
> There is a rcu_dereference_raw(ptr) in idr_get_next()
> 
> This could be changed to rcu_dereference_check(ptr, condition) to get
> lockdep support for free :)
> 
> [ condition would be the appropriate
> lockdep_is_held(&the_lock_protecting_my_idr) or 'I use the rcu variant'
> and I hold rcu_read_lock ]
> 
> This would need to add a 'condition' parameter to idr_get_next(), but we
> have very few users in kernel at this moment.

idr_get_next() was introduced for memcg, and has only one other user
in the tree (drivers/mtd/mtdcore.c, which uses a mutex to lock it).
With the RCU fix, idr_get_next() becomes very much like idr_find().
I'll leave any fiddling with their interfaces to you guys.
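
The heart of that RCU fix can be shown with a userspace sketch (simplified
structs and an illustrative IDR_BITS value, not the real idr internals):

```c
#include <assert.h>
#include <stddef.h>

#define IDR_BITS 8	/* bits of id space per tree level; illustrative value */

struct idr_layer {
	int layer;	/* 0 at the leaf level */
};

struct idr_root {
	struct idr_layer *top;
	int layers;
};

/* old form: bound computed from idp->layers, a second unprotected read */
static int max_from_root(struct idr_root *idp)
{
	return 1 << (idp->layers * IDR_BITS);
}

/* new form: everything derived from one snapshot of ->top, so the height
 * and the tree being walked always belong to the same moment */
static int max_from_top(struct idr_layer *p)
{
	return 1 << ((p->layer + 1) * IDR_BITS);
}
```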

Hugh


* [PATCH 1/3] idr: make idr_get_next() good for rcu_read_lock()
  2012-01-19 20:46       ` Hugh Dickins
@ 2012-01-19 20:48         ` Hugh Dickins
  2012-01-20 23:48           ` Andrew Morton
  2012-01-19 20:50         ` [PATCH 2/3] cgroup: revert ss_id_lock to spinlock Hugh Dickins
  2012-01-19 20:51         ` [PATCH 3/3] memcg: let css_get_next() rely upon rcu_read_lock() Hugh Dickins
  2 siblings, 1 reply; 14+ messages in thread
From: Hugh Dickins @ 2012-01-19 20:48 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Eric Dumazet, Li Zefan, Andrew Morton, Manfred Spraul,
	KAMEZAWA Hiroyuki, Johannes Weiner, Ying Han, Greg Thelen,
	cgroups, linux-mm, linux-kernel

Make one small adjustment to idr_get_next(): take the height from the
top layer (stable under RCU) instead of from the root (unprotected by
RCU), as idr_find() does: so that it can be used with RCU locking.
Copied comment on RCU locking from idr_find().

Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Li Zefan <lizf@cn.fujitsu.com>
---
 lib/idr.c |    8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

--- 3.2.0+.orig/lib/idr.c	2012-01-04 15:55:44.000000000 -0800
+++ 3.2.0+/lib/idr.c	2012-01-19 11:55:28.780206713 -0800
@@ -595,8 +595,10 @@ EXPORT_SYMBOL(idr_for_each);
  * Returns pointer to registered object with id, which is next number to
  * given id. After being looked up, *@nextidp will be updated for the next
  * iteration.
+ *
+ * This function can be called under rcu_read_lock(), given that the leaf
+ * pointers lifetimes are correctly managed.
  */
-
 void *idr_get_next(struct idr *idp, int *nextidp)
 {
 	struct idr_layer *p, *pa[MAX_LEVEL];
@@ -605,11 +607,11 @@ void *idr_get_next(struct idr *idp, int
 	int n, max;
 
 	/* find first ent */
-	n = idp->layers * IDR_BITS;
-	max = 1 << n;
 	p = rcu_dereference_raw(idp->top);
 	if (!p)
 		return NULL;
+	n = (p->layer + 1) * IDR_BITS;
+	max = 1 << n;
 
 	while (id < max) {
 		while (n > 0 && p) {


* [PATCH 2/3] cgroup: revert ss_id_lock to spinlock
  2012-01-19 20:46       ` Hugh Dickins
  2012-01-19 20:48         ` [PATCH 1/3] idr: make idr_get_next() good for rcu_read_lock() Hugh Dickins
@ 2012-01-19 20:50         ` Hugh Dickins
  2012-01-19 20:51         ` [PATCH 3/3] memcg: let css_get_next() rely upon rcu_read_lock() Hugh Dickins
  2 siblings, 0 replies; 14+ messages in thread
From: Hugh Dickins @ 2012-01-19 20:50 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Eric Dumazet, Li Zefan, Andrew Morton, Manfred Spraul,
	KAMEZAWA Hiroyuki, Johannes Weiner, Ying Han, Greg Thelen,
	cgroups, linux-mm, linux-kernel

Commit c1e2ee2dc436 "memcg: replace ss->id_lock with a rwlock" has
now been seen to cause the unfair behavior we should have expected
from converting a spinlock to an rwlock: softlockup in cgroup_mkdir(),
whose get_new_cssid() is waiting for the wlock, while there are 19
tasks using the rlock in css_get_next() to get on with their memcg
workload (in an artificial test, admittedly).  Yet lib/idr.c was
made suitable for RCU way back: revert that commit, restoring
ss->id_lock to a spinlock.

Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Li Zefan <lizf@cn.fujitsu.com>
---
 include/linux/cgroup.h |    2 +-
 kernel/cgroup.c        |   18 +++++++++---------
 2 files changed, 10 insertions(+), 10 deletions(-)

--- 3.2.0+.orig/include/linux/cgroup.h	2012-01-14 13:01:57.000000000 -0800
+++ 3.2.0+/include/linux/cgroup.h	2012-01-19 12:14:47.420233522 -0800
@@ -535,7 +535,7 @@ struct cgroup_subsys {
 	struct list_head sibling;
 	/* used when use_id == true */
 	struct idr idr;
-	rwlock_t id_lock;
+	spinlock_t id_lock;
 
 	/* should be defined only by modular subsystems */
 	struct module *module;
--- 3.2.0+.orig/kernel/cgroup.c	2012-01-14 13:01:57.000000000 -0800
+++ 3.2.0+/kernel/cgroup.c	2012-01-19 12:16:04.132235263 -0800
@@ -4939,9 +4939,9 @@ void free_css_id(struct cgroup_subsys *s
 
 	rcu_assign_pointer(id->css, NULL);
 	rcu_assign_pointer(css->id, NULL);
-	write_lock(&ss->id_lock);
+	spin_lock(&ss->id_lock);
 	idr_remove(&ss->idr, id->id);
-	write_unlock(&ss->id_lock);
+	spin_unlock(&ss->id_lock);
 	kfree_rcu(id, rcu_head);
 }
 EXPORT_SYMBOL_GPL(free_css_id);
@@ -4967,10 +4967,10 @@ static struct css_id *get_new_cssid(stru
 		error = -ENOMEM;
 		goto err_out;
 	}
-	write_lock(&ss->id_lock);
+	spin_lock(&ss->id_lock);
 	/* Don't use 0. allocates an ID of 1-65535 */
 	error = idr_get_new_above(&ss->idr, newid, 1, &myid);
-	write_unlock(&ss->id_lock);
+	spin_unlock(&ss->id_lock);
 
 	/* Returns error when there are no free spaces for new ID.*/
 	if (error) {
@@ -4985,9 +4985,9 @@ static struct css_id *get_new_cssid(stru
 	return newid;
 remove_idr:
 	error = -ENOSPC;
-	write_lock(&ss->id_lock);
+	spin_lock(&ss->id_lock);
 	idr_remove(&ss->idr, myid);
-	write_unlock(&ss->id_lock);
+	spin_unlock(&ss->id_lock);
 err_out:
 	kfree(newid);
 	return ERR_PTR(error);
@@ -4999,7 +4999,7 @@ static int __init_or_module cgroup_init_
 {
 	struct css_id *newid;
 
-	rwlock_init(&ss->id_lock);
+	spin_lock_init(&ss->id_lock);
 	idr_init(&ss->idr);
 
 	newid = get_new_cssid(ss, 0);
@@ -5094,9 +5094,9 @@ css_get_next(struct cgroup_subsys *ss, i
 		 * scan next entry from bitmap(tree), tmpid is updated after
 		 * idr_get_next().
 		 */
-		read_lock(&ss->id_lock);
+		spin_lock(&ss->id_lock);
 		tmp = idr_get_next(&ss->idr, &tmpid);
-		read_unlock(&ss->id_lock);
+		spin_unlock(&ss->id_lock);
 
 		if (!tmp)
 			break;


* [PATCH 3/3] memcg: let css_get_next() rely upon rcu_read_lock()
  2012-01-19 20:46       ` Hugh Dickins
  2012-01-19 20:48         ` [PATCH 1/3] idr: make idr_get_next() good for rcu_read_lock() Hugh Dickins
  2012-01-19 20:50         ` [PATCH 2/3] cgroup: revert ss_id_lock to spinlock Hugh Dickins
@ 2012-01-19 20:51         ` Hugh Dickins
  2012-01-19 20:53           ` Tejun Heo
  2 siblings, 1 reply; 14+ messages in thread
From: Hugh Dickins @ 2012-01-19 20:51 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Eric Dumazet, Li Zefan, Andrew Morton, Manfred Spraul,
	KAMEZAWA Hiroyuki, Johannes Weiner, Ying Han, Greg Thelen,
	cgroups, linux-mm, linux-kernel

Remove lock and unlock around css_get_next()'s call to idr_get_next().
memcg iterators (only users of css_get_next) already did rcu_read_lock(),
and its comment demands that; but add a WARN_ON_ONCE to make sure of it.

Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Li Zefan <lizf@cn.fujitsu.com>
---
 kernel/cgroup.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

--- 3.2.0+.orig/kernel/cgroup.c	2012-01-19 12:16:04.000000000 -0800
+++ 3.2.0+/kernel/cgroup.c	2012-01-19 12:22:55.188244820 -0800
@@ -5087,6 +5087,8 @@ css_get_next(struct cgroup_subsys *ss, i
 		return NULL;
 
 	BUG_ON(!ss->use_id);
+	WARN_ON_ONCE(!rcu_read_lock_held());
+
 	/* fill start point for scan */
 	tmpid = id;
 	while (1) {
@@ -5094,10 +5096,7 @@ css_get_next(struct cgroup_subsys *ss, i
 		 * scan next entry from bitmap(tree), tmpid is updated after
 		 * idr_get_next().
 		 */
-		spin_lock(&ss->id_lock);
 		tmp = idr_get_next(&ss->idr, &tmpid);
-		spin_unlock(&ss->id_lock);
-
 		if (!tmp)
 			break;
 		if (tmp->depth >= depth && tmp->stack[depth] == rootid) {


* Re: [PATCH 3/3] memcg: let css_get_next() rely upon rcu_read_lock()
  2012-01-19 20:51         ` [PATCH 3/3] memcg: let css_get_next() rely upon rcu_read_lock() Hugh Dickins
@ 2012-01-19 20:53           ` Tejun Heo
  0 siblings, 0 replies; 14+ messages in thread
From: Tejun Heo @ 2012-01-19 20:53 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Eric Dumazet, Li Zefan, Andrew Morton, Manfred Spraul,
	KAMEZAWA Hiroyuki, Johannes Weiner, Ying Han, Greg Thelen,
	cgroups, linux-mm, linux-kernel

On Thu, Jan 19, 2012 at 12:51:47PM -0800, Hugh Dickins wrote:
> Remove lock and unlock around css_get_next()'s call to idr_get_next().
> memcg iterators (only users of css_get_next) already did rcu_read_lock(),
> and its comment demands that; but add a WARN_ON_ONCE to make sure of it.
> 
> Signed-off-by: Hugh Dickins <hughd@google.com>
> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Acked-by: Li Zefan <lizf@cn.fujitsu.com>

All three look good to me,

 Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

-- 
tejun


* Re: [PATCH 1/3] idr: make idr_get_next() good for rcu_read_lock()
  2012-01-19 20:48         ` [PATCH 1/3] idr: make idr_get_next() good for rcu_read_lock() Hugh Dickins
@ 2012-01-20 23:48           ` Andrew Morton
  2012-01-21  3:45             ` Hugh Dickins
  0 siblings, 1 reply; 14+ messages in thread
From: Andrew Morton @ 2012-01-20 23:48 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Tejun Heo, Eric Dumazet, Li Zefan, Manfred Spraul,
	KAMEZAWA Hiroyuki, Johannes Weiner, Ying Han, Greg Thelen,
	cgroups, linux-mm, linux-kernel

On Thu, 19 Jan 2012 12:48:48 -0800 (PST)
Hugh Dickins <hughd@google.com> wrote:

> Make one small adjustment to idr_get_next(): take the height from the
> top layer (stable under RCU) instead of from the root (unprotected by
> RCU), as idr_find() does: so that it can be used with RCU locking.
> Copied comment on RCU locking from idr_find().
> 
> Signed-off-by: Hugh Dickins <hughd@google.com>
> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Acked-by: Li Zefan <lizf@cn.fujitsu.com>
> ---
>  lib/idr.c |    8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> --- 3.2.0+.orig/lib/idr.c	2012-01-04 15:55:44.000000000 -0800
> +++ 3.2.0+/lib/idr.c	2012-01-19 11:55:28.780206713 -0800
> @@ -595,8 +595,10 @@ EXPORT_SYMBOL(idr_for_each);
>   * Returns pointer to registered object with id, which is next number to
>   * given id. After being looked up, *@nextidp will be updated for the next
>   * iteration.
> + *
> + * This function can be called under rcu_read_lock(), given that the leaf
> + * pointers lifetimes are correctly managed.

Awkward comment.  It translates to "..., because the leaf pointers
lifetimes are correctly managed".

Is that what we really meant?  Or did we mean "..., provided the leaf
pointers lifetimes are correctly managed"?

Also, "pointers" should have been "pointer" or "pointer's"!




* Re: [PATCH 1/3] idr: make idr_get_next() good for rcu_read_lock()
  2012-01-20 23:48           ` Andrew Morton
@ 2012-01-21  3:45             ` Hugh Dickins
  2012-01-21 20:43               ` Randy Dunlap
  0 siblings, 1 reply; 14+ messages in thread
From: Hugh Dickins @ 2012-01-21  3:45 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Tejun Heo, Eric Dumazet, Li Zefan, Manfred Spraul,
	KAMEZAWA Hiroyuki, Johannes Weiner, Ying Han, Greg Thelen,
	cgroups, linux-mm, linux-kernel

On Fri, 20 Jan 2012, Andrew Morton wrote:
> On Thu, 19 Jan 2012 12:48:48 -0800 (PST)
> Hugh Dickins <hughd@google.com> wrote:
> > Copied comment on RCU locking from idr_find().
> > 
> > + *
> > + * This function can be called under rcu_read_lock(), given that the leaf
> > + * pointers lifetimes are correctly managed.
> 
> Awkward comment.  It translates to "..., because the leaf pointers
> lifetimes are correctly managed".
> 
> Is that what we really meant?  Or did we mean "..., provided the leaf
> pointers lifetimes are correctly managed"?

You are right, and part of me realized that even as I copied in the
comment.  I wanted to express the same optimism for idr_get_next() 
as was already expressed for idr_find() - whatever it meant ;)

I thought it was meaning a bit of both: idr.c is managing its end well
enough that rcu_read_lock() can now be used, but the caller has to
manage their locking and lifetimes appropriately too.

> 
> Also, "pointers" should have been "pointer" or "pointer's"!

You're afraid of Linus turning his "its/it's" wrath from Al to yourself.

Since "lifetimes" is in the plural, I think it would have to be
"pointers'" - I _think_ that's correct, rather than "pointers's".

But then, it's not the lifetimes of the pointers, but the lifetimes
of the objects that they point to, that's in question.  So what it
ought to say is...

... falls asleep.

Hugh


* Re: [PATCH 1/3] idr: make idr_get_next() good for rcu_read_lock()
  2012-01-21  3:45             ` Hugh Dickins
@ 2012-01-21 20:43               ` Randy Dunlap
  0 siblings, 0 replies; 14+ messages in thread
From: Randy Dunlap @ 2012-01-21 20:43 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Andrew Morton, Tejun Heo, Eric Dumazet, Li Zefan, Manfred Spraul,
	KAMEZAWA Hiroyuki, Johannes Weiner, Ying Han, Greg Thelen,
	cgroups, linux-mm, linux-kernel

On 01/20/2012 07:45 PM, Hugh Dickins wrote:
> On Fri, 20 Jan 2012, Andrew Morton wrote:
>> On Thu, 19 Jan 2012 12:48:48 -0800 (PST)
>> Hugh Dickins <hughd@google.com> wrote:
>>> Copied comment on RCU locking from idr_find().
>>>
>>> + *
>>> + * This function can be called under rcu_read_lock(), given that the leaf
>>> + * pointers lifetimes are correctly managed.
>>
>> Awkward comment.  It translates to "..., because the leaf pointers
>> lifetimes are correctly managed".
>>
>> Is that what we really meant?  Or did we mean "..., provided the leaf
>> pointers lifetimes are correctly managed"?
> 
> You are right, and part of me realized that even as I copied in the
> comment.  I wanted to express the same optimism for idr_get_next() 
> as was already expressed for idr_find() - whatever it meant ;)
> 
> I thought it was meaning a bit of both: idr.c is managing its end well
> enough that rcu_read_lock() can now be used, but the caller has to
> manage their locking and lifetimes appropriately too.
> 
>>
>> Also, "pointers" should have been "pointer" or "pointer's"!
> 
> You're afraid of Linus turning his "its/it's" wrath from Al to yourself.
> 
> Since "lifetimes" is in the plural, I think it would have to be
> "pointers'" - I _think_ that's correct, rather than "pointers's".

That seems correct to me also.

> But then, it's not the lifetimes of the pointers, but the lifetimes
> of the objects that they point to, that's in question.  So what it
> ought to say is...
> 
> ... falls asleep.

ack.

and thanks for doing all of that radix tree test harness work, Hugh.

-- 
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

