linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/3] cpuset: clean up cpuset hierarchy traversal code
@ 2008-07-28  2:47 Li Zefan
  2008-07-28  4:37 ` Paul Jackson
  2008-07-28  6:01 ` Paul Jackson
  0 siblings, 2 replies; 10+ messages in thread
From: Li Zefan @ 2008-07-28  2:47 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Paul Jackson, Paul Menage, Hidetoshi Seto, Lai Jiangshan, LKML

Use cpuset.stack_list rather than kfifo, so we avoid memory allocation
for kfifo.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/cpuset.c |   21 ++++++++-------------
 1 files changed, 8 insertions(+), 13 deletions(-)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 3624dc0..d0c57ac 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -54,7 +54,6 @@
 #include <asm/uaccess.h>
 #include <asm/atomic.h>
 #include <linux/mutex.h>
-#include <linux/kfifo.h>
 #include <linux/workqueue.h>
 #include <linux/cgroup.h>
 
@@ -532,7 +531,7 @@ update_domain_attr(struct sched_domain_attr *dattr, struct cpuset *c)
  * So the reverse nesting would risk an ABBA deadlock.
  *
  * The three key local variables below are:
- *    q  - a kfifo queue of cpuset pointers, used to implement a
+ *    q  - a linked-list queue of cpuset pointers, used to implement a
  *	   top-down scan of all cpusets.  This scan loads a pointer
  *	   to each cpuset marked is_sched_load_balance into the
  *	   array 'csa'.  For our purposes, rebuilding the schedulers
@@ -567,7 +566,7 @@ update_domain_attr(struct sched_domain_attr *dattr, struct cpuset *c)
 
 void rebuild_sched_domains(void)
 {
-	struct kfifo *q;	/* queue of cpusets to be scanned */
+	LIST_HEAD(q);		/* queue of cpusets to be scanned*/
 	struct cpuset *cp;	/* scans q */
 	struct cpuset **csa;	/* array of all cpuset ptrs */
 	int csn;		/* how many cpuset ptrs in csa so far */
@@ -577,7 +576,6 @@ void rebuild_sched_domains(void)
 	int ndoms;		/* number of sched domains in result */
 	int nslot;		/* next empty doms[] cpumask_t slot */
 
-	q = NULL;
 	csa = NULL;
 	doms = NULL;
 	dattr = NULL;
@@ -597,20 +595,19 @@ void rebuild_sched_domains(void)
 		goto rebuild;
 	}
 
-	q = kfifo_alloc(number_of_cpusets * sizeof(cp), GFP_KERNEL, NULL);
-	if (IS_ERR(q))
-		goto done;
 	csa = kmalloc(number_of_cpusets * sizeof(cp), GFP_KERNEL);
 	if (!csa)
 		goto done;
 	csn = 0;
 
-	cp = &top_cpuset;
-	__kfifo_put(q, (void *)&cp, sizeof(cp));
-	while (__kfifo_get(q, (void *)&cp, sizeof(cp))) {
+	list_add(&top_cpuset.stack_list, &q);
+	while (!list_empty(&q)) {
 		struct cgroup *cont;
 		struct cpuset *child;   /* scans child cpusets of cp */
 
+		cp = list_first_entry(&q, struct cpuset, stack_list);
+		list_del(q.next);
+
 		if (cpus_empty(cp->cpus_allowed))
 			continue;
 
@@ -619,7 +616,7 @@ void rebuild_sched_domains(void)
 
 		list_for_each_entry(cont, &cp->css.cgroup->children, sibling) {
 			child = cgroup_cs(cont);
-			__kfifo_put(q, (void *)&child, sizeof(cp));
+			list_add_tail(&child->stack_list, &q);
 		}
   	}
 
@@ -702,8 +699,6 @@ rebuild:
 	put_online_cpus();
 
 done:
-	if (q && !IS_ERR(q))
-		kfifo_free(q);
 	kfree(csa);
 	/* Don't kfree(doms) -- partition_sched_domains() does that. */
 	/* Don't kfree(dattr) -- partition_sched_domains() does that. */
-- 
1.5.4.rc3

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/3] cpuset: clean up cpuset hierarchy traversal code
  2008-07-28  2:47 [PATCH 1/3] cpuset: clean up cpuset hierarchy traversal code Li Zefan
@ 2008-07-28  4:37 ` Paul Jackson
  2008-07-28  5:12   ` Paul Jackson
  2008-07-28  6:01 ` Paul Jackson
  1 sibling, 1 reply; 10+ messages in thread
From: Paul Jackson @ 2008-07-28  4:37 UTC (permalink / raw)
  To: Li Zefan; +Cc: akpm, menage, seto.hidetoshi, laijs, linux-kernel

Li Zefan wrote:
> +	list_add(&top_cpuset.stack_list, &q);

Can you figure out if we are holding any system wide lock, such as
cgroup_mutex, at this point?  There is only one top_cpuset.stack_list
in the system, so I would think that we need to single thread its usage
somehow.

The hotplug hooks, such as common_cpu_mem_hotplug_unplug() that were
added sometime in the last few months (while the so called maintainer,
who's initials seem to be 'pj', was asleep ;) don't describe what
locking applies to them very well, and they call rebuild_sched_domains(),
where the above line of code lives.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.940.382.4214

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/3] cpuset: clean up cpuset hierarchy traversal code
  2008-07-28  4:37 ` Paul Jackson
@ 2008-07-28  5:12   ` Paul Jackson
  2008-07-29  1:44     ` Li Zefan
  0 siblings, 1 reply; 10+ messages in thread
From: Paul Jackson @ 2008-07-28  5:12 UTC (permalink / raw)
  To: Paul Jackson; +Cc: lizf, akpm, menage, seto.hidetoshi, laijs, linux-kernel

pj wrote:
> Can you figure out if we are holding any system wide lock, such as
> cgroup_mutex, at this point?

Nevermind ... this is in rebuild_sched_domains(), which must be
called with cgroup_mutex held.  So we danged well better be holding
a system wide lock.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.940.382.4214

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/3] cpuset: clean up cpuset hierarchy traversal code
  2008-07-28  2:47 [PATCH 1/3] cpuset: clean up cpuset hierarchy traversal code Li Zefan
  2008-07-28  4:37 ` Paul Jackson
@ 2008-07-28  6:01 ` Paul Jackson
  2008-07-28 17:20   ` Max Krasnyansky
  2008-07-29  1:30   ` Li Zefan
  1 sibling, 2 replies; 10+ messages in thread
From: Paul Jackson @ 2008-07-28  6:01 UTC (permalink / raw)
  To: Li Zefan; +Cc: akpm, menage, seto.hidetoshi, laijs, linux-kernel

Li Zefan wrote:
> -	q = kfifo_alloc(number_of_cpusets * sizeof(cp), GFP_KERNEL, NULL);


The block comment for rebuild_sched_domains() states:

> ... May take callback_mutex during
> * call due to the kfifo_alloc() and kmalloc() calls.

I suspect that mention of kfifo_alloc() is no longer correct,
with your patch.  If so, perhaps you could send a little additional
fix patch, to remove that mention from the comment.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.940.382.4214

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/3] cpuset: clean up cpuset hierarchy traversal code
  2008-07-28  6:01 ` Paul Jackson
@ 2008-07-28 17:20   ` Max Krasnyansky
  2008-07-29  1:25     ` Li Zefan
  2008-07-30 10:12     ` Paul Jackson
  2008-07-29  1:30   ` Li Zefan
  1 sibling, 2 replies; 10+ messages in thread
From: Max Krasnyansky @ 2008-07-28 17:20 UTC (permalink / raw)
  To: Paul Jackson; +Cc: Li Zefan, akpm, menage, seto.hidetoshi, laijs, linux-kernel

Paul Jackson wrote:
> Li Zefan wrote:
>> -	q = kfifo_alloc(number_of_cpusets * sizeof(cp), GFP_KERNEL, NULL);
> 
> 
> The block comment for rebuild_sched_domains() states:
> 
>> ... May take callback_mutex during
>> * call due to the kfifo_alloc() and kmalloc() calls.
> 
> I suspect that mention of kfifo_alloc() is no longer correct,
> with your patch.  If so, perhaps you could send a little additional
> fix patch, to remove that mention from the comment.
> 

Paul, please take a look at
	cpuset: Rework sched domains and CPU hotplug handling
patch I sent out last week.
I'd appreciate if we applied that one first. It simplifies lock nesting 
and rearranges the way sched domains are rebuilt. It is IMO a bit higher 
priority than this patch because scheduler depends on the 
rebuild_sched_domains() call and we have to call rebuild_sched_domains() 
from cpu hotplug handlers.

Max


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/3] cpuset: clean up cpuset hierarchy traversal code
  2008-07-28 17:20   ` Max Krasnyansky
@ 2008-07-29  1:25     ` Li Zefan
  2008-07-30 10:12     ` Paul Jackson
  1 sibling, 0 replies; 10+ messages in thread
From: Li Zefan @ 2008-07-29  1:25 UTC (permalink / raw)
  To: Max Krasnyansky
  Cc: Paul Jackson, akpm, menage, seto.hidetoshi, laijs, linux-kernel

> Paul, please take a look at
>     cpuset: Rework sched domains and CPU hotplug handling
> patch I sent out last week.
> I'd appreciate if we applied that one first. It simplifies lock nesting
> and rearranges the way sched domains are rebuilt. It is IMO a bit higher
> priority than this patch because scheduler depends on the
> rebuild_sched_domains() call and we have to call rebuild_sched_domains()
> from cpu hotplug handlers.
> 

I don't mind which one gets first. :)


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/3] cpuset: clean up cpuset hierarchy traversal code
  2008-07-28  6:01 ` Paul Jackson
  2008-07-28 17:20   ` Max Krasnyansky
@ 2008-07-29  1:30   ` Li Zefan
  2008-07-29 13:04     ` Paul Jackson
  1 sibling, 1 reply; 10+ messages in thread
From: Li Zefan @ 2008-07-29  1:30 UTC (permalink / raw)
  To: Paul Jackson; +Cc: akpm, menage, seto.hidetoshi, laijs, linux-kernel

Paul Jackson wrote:
> Li Zefan wrote:
>> -	q = kfifo_alloc(number_of_cpusets * sizeof(cp), GFP_KERNEL, NULL);
> 
> 
> The block comment for rebuild_sched_domains() states:
> 
>> ... May take callback_mutex during
>> * call due to the kfifo_alloc() and kmalloc() calls.
> 
> I suspect that mention of kfifo_alloc() is no longer correct,
> with your patch.  If so, perhaps you could send a little additional
> fix patch, to remove that mention from the comment.
> 

Yes, but I just noticed Max removes that comment completely in his patch,
so I guess I can leave it as it is.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/3] cpuset: clean up cpuset hierarchy traversal code
  2008-07-28  5:12   ` Paul Jackson
@ 2008-07-29  1:44     ` Li Zefan
  0 siblings, 0 replies; 10+ messages in thread
From: Li Zefan @ 2008-07-29  1:44 UTC (permalink / raw)
  To: Paul Jackson; +Cc: akpm, menage, seto.hidetoshi, laijs, linux-kernel

Paul Jackson wrote:
> pj wrote:
>> Can you figure out if we are holding any system wide lock, such as
>> cgroup_mutex, at this point?
> 
> Nevermind ... this is in rebuild_sched_domains(), which must be
> called with cgroup_mutex held.  So we danged well better be holding
> a system wide lock.
> 

Yes, and even after Max's patch, the code playing with ->stack_list is
still protected by cgroup_mutex. :)


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/3] cpuset: clean up cpuset hierarchy traversal code
  2008-07-29  1:30   ` Li Zefan
@ 2008-07-29 13:04     ` Paul Jackson
  0 siblings, 0 replies; 10+ messages in thread
From: Paul Jackson @ 2008-07-29 13:04 UTC (permalink / raw)
  To: Li Zefan; +Cc: akpm, menage, seto.hidetoshi, laijs, linux-kernel

Li Zefan wrote:
> Yes, but I just noticed Max removes that comment completely in his patch,
> so I guess I can leave it as it is.

Yeah - leave it as it is.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.940.382.4214

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/3] cpuset: clean up cpuset hierarchy traversal code
  2008-07-28 17:20   ` Max Krasnyansky
  2008-07-29  1:25     ` Li Zefan
@ 2008-07-30 10:12     ` Paul Jackson
  1 sibling, 0 replies; 10+ messages in thread
From: Paul Jackson @ 2008-07-30 10:12 UTC (permalink / raw)
  To: Max Krasnyansky; +Cc: lizf, akpm, menage, seto.hidetoshi, laijs, linux-kernel

Max wrote:
> Paul, please take a look at
> 	cpuset: Rework sched domains and CPU hotplug handling
> patch I sent out last week.

Ok - I've been trying to get to this for the last two days,
but my good employer has reorganized me twice so far this week ;).

If I can keep the same boss for more than eight hours,
I should find time to get to this today.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.940.382.4214

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2008-07-30 10:13 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-07-28  2:47 [PATCH 1/3] cpuset: clean up cpuset hierarchy traversal code Li Zefan
2008-07-28  4:37 ` Paul Jackson
2008-07-28  5:12   ` Paul Jackson
2008-07-29  1:44     ` Li Zefan
2008-07-28  6:01 ` Paul Jackson
2008-07-28 17:20   ` Max Krasnyansky
2008-07-29  1:25     ` Li Zefan
2008-07-30 10:12     ` Paul Jackson
2008-07-29  1:30   ` Li Zefan
2008-07-29 13:04     ` Paul Jackson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).