linux-kernel.vger.kernel.org archive mirror
* Re: v2.6.21.4-rt11
       [not found]                 ` <20070616161213.GA2994@linux.vnet.ibm.com>
@ 2007-06-18 15:12                   ` Srivatsa Vaddagiri
  2007-06-18 16:54                     ` v2.6.21.4-rt11 Christoph Lameter
                                       ` (2 more replies)
  0 siblings, 3 replies; 39+ messages in thread
From: Srivatsa Vaddagiri @ 2007-06-18 15:12 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Ingo Molnar, Thomas Gleixner, Dinakar Guniguntala,
	Dmitry Adamushko, suresh.b.siddha, pwil3058, clameter,
	linux-kernel, akpm

On Sat, Jun 16, 2007 at 09:12:13AM -0700, Paul E. McKenney wrote:
> On Sat, Jun 16, 2007 at 02:14:34PM +0530, Srivatsa Vaddagiri wrote:
> > On Fri, Jun 15, 2007 at 06:16:05PM -0700, Paul E. McKenney wrote:
> > > On Fri, Jun 15, 2007 at 09:55:45PM +0200, Ingo Molnar wrote:
> > > > 
> > > > * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> > > > 
> > > > > > to make sure it's not some effect in -rt causing this. v17 has an 
> > > > > > updated load balancing code. (which might or might not affect the 
> > > > > > rcutorture problem.)
> > > > > 
> > > > > Good point!  I will try the following:
> > > > > 
> > > > > 1.	Stock 2.6.21.5.
> > > > > 
> > > > > 2.	2.6.21-rt14.
> > > > > 
> > > > > 3.	2.6.21.5 + sched-cfs-v2.6.21.5-v17.patch
> > > > > 
> > > > > And quickly, before everyone else jumps on the machines that show the 
> > > > > problem.  ;-)
> > > > 
> > > > thanks! It's enough to check whether modprobe rcutorture still produces 
> > > > that weird balancing problem. That clearly has to be fixed ...
> > > > 
> > > > And i've Cc:-ed Dmitry and Srivatsa, who are busy hacking this area of 
> > > > the CFS code as we speak :-)
> > > 
> > > Well, I am not sure that the info I was able to collect will be all
> > > that helpful, but it most certainly does confirm that the balancing
> > > problem that rcutorture produces is indeed weird...
> > 
> > Hi Paul, 
> > 	I tried on two machines in our lab and could not recreate your
> > problem.
> > 
> > On a 2way x86_64 AMD box and 2.6.21.5+cfsv17:
> > 
> > 
> >   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> > 12395 root      39  19     0    0    0 R 50.3  0.0   0:57.62 rcu_torture_rea
> > 12394 root      39  19     0    0    0 R 49.9  0.0   0:57.29 rcu_torture_rea
> > 12396 root      39  19     0    0    0 R 49.9  0.0   0:56.96 rcu_torture_rea
> > 12397 root      39  19     0    0    0 R 49.9  0.0   0:56.90 rcu_torture_rea
> > 
> > On a 4way x86_64 Intel Xeon box and 2.6.21.5+cfsv17:
> > 
> >   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  P COMMAND
> >  6258 root      39  19     0    0    0 R   53  0.0  17:29.72 0 rcu_torture_rea
> >  6252 root      39  19     0    0    0 R   49  0.0  17:49.40 3 rcu_torture_rea
> >  6257 root      39  19     0    0    0 R   49  0.0  17:22.49 2 rcu_torture_rea
> >  6256 root      39  19     0    0    0 R   48  0.0  17:50.12 1 rcu_torture_rea
> >  6254 root      39  19     0    0    0 R   48  0.0  17:26.98 0 rcu_torture_rea
> >  6255 root      39  19     0    0    0 R   48  0.0  17:25.74 2 rcu_torture_rea
> >  6251 root      39  19     0    0    0 R   45  0.0  17:47.45 3 rcu_torture_rea
> >  6253 root      39  19     0    0    0 R   45  0.0  17:48.48 1 rcu_torture_rea
> > 
> > 
> > I will try this on a few more boxes we have on Monday. If I can't recreate it,
> > then I may request you to provide me machine details (or even access to the
> > problem box if it is in IBM labs and if I am allowed to login!)
> 
> elm3b6, ABAT job 95107.  There are others, this particular job uses
> 2.6.21.5-cfsv17.

Paul,
	I logged into elm3b6 and did some investigation. I think I have
a tentative patch to fix your load-balance problem.

First, an explanation of the problem:

This particular machine, elm3b6, is a 4-cpu, (gasp, yes!) 4-node box, i.e.
each CPU is a node by itself. If you don't have CONFIG_NUMA enabled,
then we won't have cross-node (i.e. cross-cpu) load balancing.
Fortunately in your case you had CONFIG_NUMA enabled, but you were still
hitting the (gross) load imbalance.

The problem seems to be with idle_balance(). This particular routine,
invoked by schedule() on an idle cpu, walks up the sched-domain hierarchy
and tries to balance in each domain that has the SD_BALANCE_NEWIDLE flag
set. The nodes-level domain (SD_NODE_INIT) however doesn't set this flag,
which means an idle cpu looks for (im)balance within its own node at most
and not beyond. Now, here's the problem: if the idle cpu doesn't find an
imbalance within its node (pulled_task = 0), it resets
this_rq->next_balance so that the next balancing activity is deferred for
up to a minute (next_balance = jiffies + 60 * HZ). If an idle cpu calls
idle_balance() again within the next minute and finds no imbalance within
its node, it -again- resets next_balance. In your case, I think this was
happening repetitively, which meant other CPUs never looked for cross-node
(im)balance.

I believe the patch below is correct. With the patch applied, I could
not recreate the imbalance with rcutorture. Let me know whether you
still see the problem with this patch applied on any other machine.

I have CCed others who have worked in this area and request them to review 
this patch.

Andrew,
	If there is no objection from anyone, request you to pick this
up for next -mm release. It has been tested against 2.6.22-rc4-mm2.


idle_balance() can erroneously cause a system-wide imbalance to be
overlooked by resetting rq->next_balance. When called a sufficient number
of times, it can defer system-wide load balancing forever. The patch below
modifies idle_balance() not to mess with ->next_balance. If it indeed
turns out that there is no imbalance even system-wide, rebalance_domains()
will set ->next_balance to happen after a minute anyway.


Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>


Index: linux-2.6.22-rc4/kernel/sched.c
===================================================================
--- linux-2.6.22-rc4.orig/kernel/sched.c	2007-06-18 07:16:49.000000000 -0700
+++ linux-2.6.22-rc4/kernel/sched.c	2007-06-18 07:18:41.000000000 -0700
@@ -2490,27 +2490,16 @@
 {
 	struct sched_domain *sd;
 	int pulled_task = 0;
-	unsigned long next_balance = jiffies + 60 *  HZ;
 
 	for_each_domain(this_cpu, sd) {
 		if (sd->flags & SD_BALANCE_NEWIDLE) {
 			/* If we've pulled tasks over stop searching: */
 			pulled_task = load_balance_newidle(this_cpu,
 							this_rq, sd);
-			if (time_after(next_balance,
-				  sd->last_balance + sd->balance_interval))
-				next_balance = sd->last_balance
-						+ sd->balance_interval;
 			if (pulled_task)
 				break;
 		}
 	}
-	if (!pulled_task)
-		/*
-		 * We are going idle. next_balance may be set based on
-		 * a busy processor. So reset next_balance.
-		 */
-		this_rq->next_balance = next_balance;
 }
 
 /*


-- 
Regards,
vatsa

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: v2.6.21.4-rt11
  2007-06-18 15:12                   ` v2.6.21.4-rt11 Srivatsa Vaddagiri
@ 2007-06-18 16:54                     ` Christoph Lameter
  2007-06-18 17:35                       ` v2.6.21.4-rt11 Srivatsa Vaddagiri
  2007-06-18 18:06                     ` v2.6.21.4-rt11 Srivatsa Vaddagiri
  2007-06-19  9:04                     ` v2.6.21.4-rt11 Ingo Molnar
  2 siblings, 1 reply; 39+ messages in thread
From: Christoph Lameter @ 2007-06-18 16:54 UTC (permalink / raw)
  To: Srivatsa Vaddagiri
  Cc: Paul E. McKenney, Ingo Molnar, Thomas Gleixner,
	Dinakar Guniguntala, Dmitry Adamushko, suresh.b.siddha, pwil3058,
	linux-kernel, akpm

On Mon, 18 Jun 2007, Srivatsa Vaddagiri wrote:

> This particular machine, elm3b6, is a 4-cpu, (gasp, yes!) 4-node box, i.e.
> each CPU is a node by itself. If you don't have CONFIG_NUMA enabled,
> then we won't have cross-node (i.e. cross-cpu) load balancing.
> Fortunately in your case you had CONFIG_NUMA enabled, but you were still
> hitting the (gross) load imbalance.
> 
> The problem seems to be with idle_balance(). This particular routine,
> invoked by schedule() on an idle cpu, walks up the sched-domain hierarchy
> and tries to balance in each domain that has the SD_BALANCE_NEWIDLE flag
> set. The nodes-level domain (SD_NODE_INIT) however doesn't set this flag,
> which means an idle cpu looks for (im)balance within its own node at most and

The nodes-level domain looks for internode balances between up to 16 
nodes. It is not restricted to a single node. The balancing on the
phys_domain level does balance within a node.


* Re: v2.6.21.4-rt11
  2007-06-18 16:54                     ` v2.6.21.4-rt11 Christoph Lameter
@ 2007-06-18 17:35                       ` Srivatsa Vaddagiri
  2007-06-18 17:59                         ` v2.6.21.4-rt11 Christoph Lameter
  0 siblings, 1 reply; 39+ messages in thread
From: Srivatsa Vaddagiri @ 2007-06-18 17:35 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Paul E. McKenney, Ingo Molnar, Thomas Gleixner,
	Dinakar Guniguntala, Dmitry Adamushko, suresh.b.siddha, pwil3058,
	linux-kernel, akpm

On Mon, Jun 18, 2007 at 09:54:18AM -0700, Christoph Lameter wrote:
> The nodes-level domain looks for internode balances between up to 16 
> nodes. It is not restricted to a single node.

I was mostly speaking with the example system in mind (4-node 4-cpu
box), but yes, node-level domain does look for imbalance across max 16
nodes as you mention.

Neither the node-level nor the all-nodes domain has SD_BALANCE_NEWIDLE
set, which means idle_balance() will stop looking for imbalance beyond
its own node. IMO, idle_balance() should not reset ->next_balance based
only on the balance it observes within its own node.

-- 
Regards,
vatsa


* Re: v2.6.21.4-rt11
  2007-06-18 17:35                       ` v2.6.21.4-rt11 Srivatsa Vaddagiri
@ 2007-06-18 17:59                         ` Christoph Lameter
  2007-06-19  1:52                           ` v2.6.21.4-rt11 Srivatsa Vaddagiri
  2007-06-19  2:15                           ` v2.6.21.4-rt11 Siddha, Suresh B
  0 siblings, 2 replies; 39+ messages in thread
From: Christoph Lameter @ 2007-06-18 17:59 UTC (permalink / raw)
  To: Srivatsa Vaddagiri
  Cc: Paul E. McKenney, Ingo Molnar, Thomas Gleixner,
	Dinakar Guniguntala, Dmitry Adamushko, suresh.b.siddha, pwil3058,
	linux-kernel, akpm

On Mon, 18 Jun 2007, Srivatsa Vaddagiri wrote:

> On Mon, Jun 18, 2007 at 09:54:18AM -0700, Christoph Lameter wrote:
> > The nodes-level domain looks for internode balances between up to 16 
> > nodes. It is not restricted to a single node.
> 
> I was mostly speaking with the example system in mind (4-node 4-cpu
> box), but yes, node-level domain does look for imbalance across max 16
> nodes as you mention.
> 
> Neither the node-level nor the all-nodes domain has SD_BALANCE_NEWIDLE
> set, which means idle_balance() will stop looking for imbalance beyond
> its own node. IMO, idle_balance() should not reset ->next_balance based
> only on the balance it observes within its own node.

I think the check in idle_balance needs to be modified.

If the domain *does not* have SD_BALANCE_NEWIDLE set then
next_balance must still be set right. Does this patch fix it?



Scheduler: Fix next_interval determination in idle_balance().

The intervals of domains that do not have SD_BALANCE_NEWIDLE must
be considered for the calculation of the time of the next balance. 
Otherwise we may defer rebalancing forever.

Signed-off-by: Christoph Lameter <clameter@sgi.com>

Index: linux-2.6.22-rc4-mm2/kernel/sched.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/kernel/sched.c	2007-06-18 10:56:31.000000000 -0700
+++ linux-2.6.22-rc4-mm2/kernel/sched.c	2007-06-18 10:57:10.000000000 -0700
@@ -2493,17 +2493,16 @@ static void idle_balance(int this_cpu, s
 	unsigned long next_balance = jiffies + 60 *  HZ;
 
 	for_each_domain(this_cpu, sd) {
-		if (sd->flags & SD_BALANCE_NEWIDLE) {
+		if (sd->flags & SD_BALANCE_NEWIDLE)
 			/* If we've pulled tasks over stop searching: */
 			pulled_task = load_balance_newidle(this_cpu,
 							this_rq, sd);
-			if (time_after(next_balance,
-				  sd->last_balance + sd->balance_interval))
-				next_balance = sd->last_balance
-						+ sd->balance_interval;
-			if (pulled_task)
-				break;
-		}
+		if (time_after(next_balance,
+			  sd->last_balance + sd->balance_interval))
+			next_balance = sd->last_balance
+					+ sd->balance_interval;
+		if (pulled_task)
+			break;
 	}
 	if (!pulled_task)
 		/*


* Re: v2.6.21.4-rt11
  2007-06-18 15:12                   ` v2.6.21.4-rt11 Srivatsa Vaddagiri
  2007-06-18 16:54                     ` v2.6.21.4-rt11 Christoph Lameter
@ 2007-06-18 18:06                     ` Srivatsa Vaddagiri
  2007-06-19  9:04                     ` v2.6.21.4-rt11 Ingo Molnar
  2 siblings, 0 replies; 39+ messages in thread
From: Srivatsa Vaddagiri @ 2007-06-18 18:06 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Ingo Molnar, Thomas Gleixner, Dinakar Guniguntala,
	Dmitry Adamushko, suresh.b.siddha, pwil3058, clameter,
	linux-kernel, akpm

On Mon, Jun 18, 2007 at 08:42:15PM +0530, Srivatsa Vaddagiri wrote:
> If you don't have CONFIG_NUMA enabled,
> then we won't have cross-node (i.e cross-cpu) load balancing.

Mmm.. that is not correct. I found that disabling CONFIG_NUMA leads
to better load balance on the problem system (i.e. without any patches
applied, 2.6.22-rc4-mm2 gives a good distribution of rcu readers across
all 4 cpus).

Anyway, the patch is still needed for scenarios like you originally
tested with.

-- 
Regards,
vatsa


* Re: v2.6.21.4-rt11
  2007-06-18 17:59                         ` v2.6.21.4-rt11 Christoph Lameter
@ 2007-06-19  1:52                           ` Srivatsa Vaddagiri
  2007-06-19  2:13                             ` v2.6.21.4-rt11 Siddha, Suresh B
  2007-06-19  2:15                           ` v2.6.21.4-rt11 Siddha, Suresh B
  1 sibling, 1 reply; 39+ messages in thread
From: Srivatsa Vaddagiri @ 2007-06-19  1:52 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Paul E. McKenney, Ingo Molnar, Thomas Gleixner,
	Dinakar Guniguntala, Dmitry Adamushko, suresh.b.siddha, pwil3058,
	linux-kernel, akpm

On Mon, Jun 18, 2007 at 10:59:21AM -0700, Christoph Lameter wrote:
> I think the check in idle_balance needs to be modified.
> 
> If the domain *does not* have SD_BALANCE_NEWIDLE set then
> next_balance must still be set right. Does this patch fix it?

Is the ->next_balance calculation in idle_balance() necessary at all?
rebalance_domains() would have programmed ->next_balance anyway, based
on the nearest next_balance point of all (load-balance'able) domains.
By repeating that calculation in idle_balance, are we covering any corner case?

-- 
Regards,
vatsa


* Re: v2.6.21.4-rt11
  2007-06-19  1:52                           ` v2.6.21.4-rt11 Srivatsa Vaddagiri
@ 2007-06-19  2:13                             ` Siddha, Suresh B
  0 siblings, 0 replies; 39+ messages in thread
From: Siddha, Suresh B @ 2007-06-19  2:13 UTC (permalink / raw)
  To: Srivatsa Vaddagiri
  Cc: Christoph Lameter, Paul E. McKenney, Ingo Molnar,
	Thomas Gleixner, Dinakar Guniguntala, Dmitry Adamushko,
	suresh.b.siddha, pwil3058, linux-kernel, akpm

On Tue, Jun 19, 2007 at 07:22:32AM +0530, Srivatsa Vaddagiri wrote:
> On Mon, Jun 18, 2007 at 10:59:21AM -0700, Christoph Lameter wrote:
> > I think the check in idle_balance needs to be modified.
> > 
> > If the domain *does not* have SD_BALANCE_NEWIDLE set then
> > next_balance must still be set right. Does this patch fix it?
> 
> Is the ->next_balance calculation in idle_balance() necessary at all?
> rebalance_domains() would have programmed ->next_balance anyway, based
> on the nearest next_balance point of all (load-balance'able) domains.
> By repeating that calculation in idle_balance, are we covering any corner case?

rebalance_domains() has programmed ->next_balance based on the 'busy'
state. And now, as the cpu is going 'idle', this routine recalculates
next_balance based on the 'idle' state.

thanks,
suresh


* Re: v2.6.21.4-rt11
  2007-06-18 17:59                         ` v2.6.21.4-rt11 Christoph Lameter
  2007-06-19  1:52                           ` v2.6.21.4-rt11 Srivatsa Vaddagiri
@ 2007-06-19  2:15                           ` Siddha, Suresh B
  2007-06-19  3:46                             ` v2.6.21.4-rt11 Christoph Lameter
  1 sibling, 1 reply; 39+ messages in thread
From: Siddha, Suresh B @ 2007-06-19  2:15 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Srivatsa Vaddagiri, Paul E. McKenney, Ingo Molnar,
	Thomas Gleixner, Dinakar Guniguntala, Dmitry Adamushko,
	suresh.b.siddha, pwil3058, linux-kernel, akpm

On Mon, Jun 18, 2007 at 10:59:21AM -0700, Christoph Lameter wrote:
>  	for_each_domain(this_cpu, sd) {
> -		if (sd->flags & SD_BALANCE_NEWIDLE) {
> +		if (sd->flags & SD_BALANCE_NEWIDLE)
>  			/* If we've pulled tasks over stop searching: */
>  			pulled_task = load_balance_newidle(this_cpu,
>  							this_rq, sd);
> -			if (time_after(next_balance,
> -				  sd->last_balance + sd->balance_interval))
> -				next_balance = sd->last_balance
> -						+ sd->balance_interval;
> -			if (pulled_task)
> -				break;
> -		}
> +		if (time_after(next_balance,
> +			  sd->last_balance + sd->balance_interval))
> +			next_balance = sd->last_balance
> +					+ sd->balance_interval;

don't we have to do msecs_to_jiffies(sd->balance_interval)?

thanks,
suresh
> +		if (pulled_task)
> +			break;
>  	}
>  	if (!pulled_task)
>  		/*


* Re: v2.6.21.4-rt11
  2007-06-19  2:15                           ` v2.6.21.4-rt11 Siddha, Suresh B
@ 2007-06-19  3:46                             ` Christoph Lameter
  2007-06-19  5:49                               ` v2.6.21.4-rt11 Srivatsa Vaddagiri
  0 siblings, 1 reply; 39+ messages in thread
From: Christoph Lameter @ 2007-06-19  3:46 UTC (permalink / raw)
  To: Siddha, Suresh B
  Cc: Srivatsa Vaddagiri, Paul E. McKenney, Ingo Molnar,
	Thomas Gleixner, Dinakar Guniguntala, Dmitry Adamushko, pwil3058,
	linux-kernel, akpm

On Mon, 18 Jun 2007, Siddha, Suresh B wrote:

> > +		if (time_after(next_balance,
> > +			  sd->last_balance + sd->balance_interval))
> > +			next_balance = sd->last_balance
> > +					+ sd->balance_interval;
> 
> don't we have to do, msecs_to_jiffies(sd->balance_interval)?

Well that is certainly a bug here. Is this better?

Scheduler: Fix next_interval determination in idle_balance().

The intervals of domains that do not have SD_BALANCE_NEWIDLE must
be considered for the calculation of the time of the next balance.
Otherwise we may defer rebalancing forever.

Siddha also spotted that the conversion of the balance interval
to jiffies is missing. Fix that too.

Signed-off-by: Christoph Lameter <clameter@sgi.com>

Index: linux-2.6.22-rc4-mm2/kernel/sched.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/kernel/sched.c	2007-06-18 20:41:46.000000000 -0700
+++ linux-2.6.22-rc4-mm2/kernel/sched.c	2007-06-18 20:44:00.000000000 -0700
@@ -2493,17 +2493,18 @@ static void idle_balance(int this_cpu, s
 	unsigned long next_balance = jiffies + 60 *  HZ;
 
 	for_each_domain(this_cpu, sd) {
-		if (sd->flags & SD_BALANCE_NEWIDLE) {
+		unsigned long interval;
+
+		if (sd->flags & SD_BALANCE_NEWIDLE)
 			/* If we've pulled tasks over stop searching: */
-			pulled_task = load_balance_newidle(this_cpu,
-							this_rq, sd);
-			if (time_after(next_balance,
-				  sd->last_balance + sd->balance_interval))
-				next_balance = sd->last_balance
-						+ sd->balance_interval;
-			if (pulled_task)
-				break;
-		}
+			pulled_task = load_balance_newidle(this_cpu,this_rq, sd);
+
+		interval = msecs_to_jiffies(sd->balance_interval);
+		if (time_after(next_balance,
+			  sd->last_balance + interval))
+			next_balance = sd->last_balance + interval;
+		if (pulled_task)
+			break;
 	}
 	if (!pulled_task)
 		/*



* Re: v2.6.21.4-rt11
  2007-06-19  3:46                             ` v2.6.21.4-rt11 Christoph Lameter
@ 2007-06-19  5:49                               ` Srivatsa Vaddagiri
  2007-06-19  8:07                                 ` v2.6.21.4-rt11 Ingo Molnar
  0 siblings, 1 reply; 39+ messages in thread
From: Srivatsa Vaddagiri @ 2007-06-19  5:49 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Siddha, Suresh B, Paul E. McKenney, Ingo Molnar, Thomas Gleixner,
	Dinakar Guniguntala, Dmitry Adamushko, pwil3058, linux-kernel,
	akpm

On Mon, Jun 18, 2007 at 08:46:03PM -0700, Christoph Lameter wrote:
> @@ -2493,17 +2493,18 @@ static void idle_balance(int this_cpu, s
>  	unsigned long next_balance = jiffies + 60 *  HZ;
> 
>  	for_each_domain(this_cpu, sd) {
> -		if (sd->flags & SD_BALANCE_NEWIDLE) {
> +		unsigned long interval;
> +

Do we need a:

		if (!(sd->flags & SD_LOAD_BALANCE))
			continue;

here?

Otherwise the patch looks good and fixes the problem Paul observed earlier.

> +		if (sd->flags & SD_BALANCE_NEWIDLE)
>  			/* If we've pulled tasks over stop searching: */
> -			pulled_task = load_balance_newidle(this_cpu,
> -							this_rq, sd);
> -			if (time_after(next_balance,
> -				  sd->last_balance + sd->balance_interval))
> -				next_balance = sd->last_balance
> -						+ sd->balance_interval;
> -			if (pulled_task)
> -				break;
> -		}
> +			pulled_task = load_balance_newidle(this_cpu,this_rq, sd);
> +
> +		interval = msecs_to_jiffies(sd->balance_interval);


> +		if (time_after(next_balance,
> +			  sd->last_balance + interval))
> +			next_balance = sd->last_balance + interval;
> +		if (pulled_task)
> +			break;
>  	}
>  	if (!pulled_task)
>  		/*

-- 
Regards,
vatsa


* Re: v2.6.21.4-rt11
  2007-06-19  5:49                               ` v2.6.21.4-rt11 Srivatsa Vaddagiri
@ 2007-06-19  8:07                                 ` Ingo Molnar
  0 siblings, 0 replies; 39+ messages in thread
From: Ingo Molnar @ 2007-06-19  8:07 UTC (permalink / raw)
  To: Srivatsa Vaddagiri
  Cc: Christoph Lameter, Siddha, Suresh B, Paul E. McKenney,
	Thomas Gleixner, Dinakar Guniguntala, Dmitry Adamushko, pwil3058,
	linux-kernel, akpm


* Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> wrote:

> On Mon, Jun 18, 2007 at 08:46:03PM -0700, Christoph Lameter wrote:
> > @@ -2493,17 +2493,18 @@ static void idle_balance(int this_cpu, s
> >  	unsigned long next_balance = jiffies + 60 *  HZ;
> > 
> >  	for_each_domain(this_cpu, sd) {
> > -		if (sd->flags & SD_BALANCE_NEWIDLE) {
> > +		unsigned long interval;
> > +
> 
> Do we need a :
> 
> 		if (!(sd->flags & SD_LOAD_BALANCE))
> 			continue;
> 
> here?
> 
> Otherwise patch look good and fixes the problem Paul observed earlier.

great! I've applied the patch below (added your fix and cleaned it up a 
bit) and have released 2.6.21.5-rt17 with it.

	Ingo

------------------------------>
From: Christoph Lameter <clameter@sgi.com>
Subject: [patch] sched: fix next_interval determination in idle_balance().

The intervals of domains that do not have SD_BALANCE_NEWIDLE must
be considered for the calculation of the time of the next balance.
Otherwise we may defer rebalancing forever.

Siddha also spotted that the conversion of the balance interval
to jiffies is missing. Fix that too.

From: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>

also continue the loop if !(sd->flags & SD_LOAD_BALANCE).

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/sched.c |   22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -2591,17 +2591,21 @@ static void idle_balance(int this_cpu, s
 	unsigned long next_balance = jiffies + HZ;
 
 	for_each_domain(this_cpu, sd) {
-		if (sd->flags & SD_BALANCE_NEWIDLE) {
+		unsigned long interval;
+
+		if (!(sd->flags & SD_LOAD_BALANCE))
+			continue;
+
+		if (sd->flags & SD_BALANCE_NEWIDLE)
 			/* If we've pulled tasks over stop searching: */
 			pulled_task = load_balance_newidle(this_cpu,
-							this_rq, sd);
-			if (time_after(next_balance,
-				  sd->last_balance + sd->balance_interval))
-				next_balance = sd->last_balance
-						+ sd->balance_interval;
-			if (pulled_task)
-				break;
-		}
+								this_rq, sd);
+
+		interval = msecs_to_jiffies(sd->balance_interval);
+		if (time_after(next_balance, sd->last_balance + interval))
+			next_balance = sd->last_balance + interval;
+		if (pulled_task)
+			break;
 	}
 	if (pulled_task || time_after(jiffies, this_rq->next_balance)) {
 		/*


* Re: v2.6.21.4-rt11
  2007-06-18 15:12                   ` v2.6.21.4-rt11 Srivatsa Vaddagiri
  2007-06-18 16:54                     ` v2.6.21.4-rt11 Christoph Lameter
  2007-06-18 18:06                     ` v2.6.21.4-rt11 Srivatsa Vaddagiri
@ 2007-06-19  9:04                     ` Ingo Molnar
  2007-06-19 10:43                       ` v2.6.21.4-rt11 Srivatsa Vaddagiri
                                         ` (3 more replies)
  2 siblings, 4 replies; 39+ messages in thread
From: Ingo Molnar @ 2007-06-19  9:04 UTC (permalink / raw)
  To: Srivatsa Vaddagiri
  Cc: Paul E. McKenney, Thomas Gleixner, Dinakar Guniguntala,
	Dmitry Adamushko, suresh.b.siddha, pwil3058, clameter,
	linux-kernel, akpm


* Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> wrote:

> I believe the patch below is correct. With the patch applied, I could 
> not recreate the imbalance with rcutorture. Let me know whether you 
> still see the problem with this patch applied on any other machine.

thanks for tracking this down! I've applied Christoph's patch (with your 
suggested modification plus a few small cleanups).

I'm wondering, why did this trigger under CFS and not on mainline? 
Mainline seems to have a similar problem in idle_balance() too, or am i 
misreading it?

	Ingo


* Re: v2.6.21.4-rt11
  2007-06-19  9:04                     ` v2.6.21.4-rt11 Ingo Molnar
@ 2007-06-19 10:43                       ` Srivatsa Vaddagiri
  2007-06-19 14:33                       ` v2.6.21.4-rt11 Srivatsa Vaddagiri
                                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 39+ messages in thread
From: Srivatsa Vaddagiri @ 2007-06-19 10:43 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Paul E. McKenney, Thomas Gleixner, Dinakar Guniguntala,
	Dmitry Adamushko, suresh.b.siddha, pwil3058, clameter,
	linux-kernel, akpm

On Tue, Jun 19, 2007 at 11:04:30AM +0200, Ingo Molnar wrote:
> I'm wondering, why did this trigger under CFS and not on mainline? 

I thought Paul had seen the same problem with 2.6.21.5. I will try a
more recent mainline (2.6.22-rc5 maybe) after I get hold of the problem
machine and report later today.

If there is any difference, it should be because of the topology reported
by the low-level platform code. In the problem case, each CPU was reported
as a separate node (CONFIG_NUMA enabled), which caused idle_balance()
to stop load-balance lookups at the cpu/node level itself.

> Mainline seems to have a similar problem in idle_balance() too, or am i 
> misreading it?

-- 
Regards,
vatsa


* Re: v2.6.21.4-rt11
  2007-06-19  9:04                     ` v2.6.21.4-rt11 Ingo Molnar
  2007-06-19 10:43                       ` v2.6.21.4-rt11 Srivatsa Vaddagiri
@ 2007-06-19 14:33                       ` Srivatsa Vaddagiri
  2007-06-19 19:15                         ` v2.6.21.4-rt11 Christoph Lameter
  2007-06-19 15:08                       ` v2.6.21.4-rt11 Paul E. McKenney
  2007-06-19 19:14                       ` v2.6.21.4-rt11 Christoph Lameter
  3 siblings, 1 reply; 39+ messages in thread
From: Srivatsa Vaddagiri @ 2007-06-19 14:33 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Paul E. McKenney, Thomas Gleixner, Dinakar Guniguntala,
	Dmitry Adamushko, suresh.b.siddha, pwil3058, clameter,
	linux-kernel, akpm

On Tue, Jun 19, 2007 at 11:04:30AM +0200, Ingo Molnar wrote:
> I'm wondering, why did this trigger under CFS and not on mainline? 
> Mainline seems to have a similar problem in idle_balance() too, or am i 
> misreading it?

The problem is very much present in mainline. I could recreate it
with 2.6.22-rc5 (which doesn't have CFS) on that same hardware, with
CONFIG_NUMA enabled.

Let me know if you need anything else clarified.

-- 
Regards,
vatsa


* Re: v2.6.21.4-rt11
  2007-06-19  9:04                     ` v2.6.21.4-rt11 Ingo Molnar
  2007-06-19 10:43                       ` v2.6.21.4-rt11 Srivatsa Vaddagiri
  2007-06-19 14:33                       ` v2.6.21.4-rt11 Srivatsa Vaddagiri
@ 2007-06-19 15:08                       ` Paul E. McKenney
  2007-06-19 19:14                       ` v2.6.21.4-rt11 Christoph Lameter
  3 siblings, 0 replies; 39+ messages in thread
From: Paul E. McKenney @ 2007-06-19 15:08 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Srivatsa Vaddagiri, Thomas Gleixner, Dinakar Guniguntala,
	Dmitry Adamushko, suresh.b.siddha, pwil3058, clameter,
	linux-kernel, akpm

On Tue, Jun 19, 2007 at 11:04:30AM +0200, Ingo Molnar wrote:
> 
> * Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> wrote:
> 
> > I believe the patch below is correct. With the patch applied, I could 
> > not recreate the imbalance with rcutorture. Let me know whether you 
> > still see the problem with this patch applied on any other machine.
> 
> thanks for tracking this down! I've applied Christoph's patch (with your 
> suggested modification plus a few small cleanups).
> 
> I'm wondering, why did this trigger under CFS and not on mainline? 
> Mainline seems to have a similar problem in idle_balance() too, or am i 
> misreading it?

It did in fact trigger under all three of mainline, CFS, and -rt including
CFS -- see below for a couple of emails from last Friday giving results
for these three on the AMD box (where it happened) and on a single-quad
NUMA-Q system (where it did not, at least not with such severity).

That said, there certainly was a time when neither mainline nor -rt
acted this way!

						Thanx, Paul

------------------------------------------------------------------------

Date: Fri, 15 Jun 2007 13:06:17 -0700
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Dinakar Guniguntala <dino@in.ibm.com>
Subject: Re: v2.6.21.4-rt11

On Fri, Jun 15, 2007 at 08:14:52AM -0700, Paul E. McKenney wrote:
> On Fri, Jun 15, 2007 at 04:45:35PM +0200, Ingo Molnar wrote:
> > 
> > Paul,
> > 
> > do you still see the load-distribution problem with -rt14? (which 
> > includes cfsv17) Or rather ... could you try vanilla cfsv17 instead:
> > 
> >    http://people.redhat.com/mingo/cfs-scheduler/
> > 
> > to make sure it's not some effect in -rt causing this. v17 has an 
> > updated load balancing code. (which might or might not affect the 
> > rcutorture problem.)

No joy, see below.  Strangely hardware dependent.  My next step, left
to myself, would be to patch rcutorture.c to cause the readers to dump
the CFS state information every ten seconds or so.  My guess is that
the important per-task stuff is:

	current->sched_info.pcnt
	current->sched_info.cpu_time
	current->sched_info.run_delay
	current->sched_info.last_arrival
	current->sched_info.last_queued

And maybe the runqueue info dumped out by show_schedstat, this last
via new per-CPU tasks.

Other thoughts?

						Thanx, Paul

> Good point!  I will try the following:
> 
> 1.	Stock 2.6.21.5.  64-bit kernel on AMD Opterons.

All eight readers end up on the same CPU, CPU 1 in this case.  And they
stay there (ten minutes).

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND           
 3058 root      39  19     0    0    0 R 12.7  0.0   0:06.91 rcu_torture_rea   
 3059 root      39  19     0    0    0 R 12.7  0.0   0:06.91 rcu_torture_rea   
 3060 root      39  19     0    0    0 R 12.7  0.0   0:06.91 rcu_torture_rea   
 3061 root      39  19     0    0    0 R 12.7  0.0   0:06.91 rcu_torture_rea   
 3062 root      39  19     0    0    0 R 12.7  0.0   0:06.91 rcu_torture_rea   
 3063 root      39  19     0    0    0 R 12.7  0.0   0:06.91 rcu_torture_rea   
 3057 root      39  19     0    0    0 R 12.3  0.0   0:06.91 rcu_torture_rea   
 3064 root      39  19     0    0    0 R 12.3  0.0   0:06.91 rcu_torture_rea   

> 1.	Stock 2.6.21.5.  32-bit kernel on NUMA-Q.

Works just fine(!).

> 2.	2.6.21-rt14.  64-bit kernel on AMD Opterons.

All eight readers are spread, but over only two CPUs (0 and 3, in this
case).  Persists, usually with 4/4 split, but sometimes with five
tasks on one CPU and three on the other.

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND           
 3111 root      39  19     0    0    0 R 23.9  0.0   0:27.27 rcu_torture_rea   
 3114 root      39  19     0    0    0 R 23.9  0.0   0:28.58 rcu_torture_rea   
 3117 root      39  19     0    0    0 R 23.9  0.0   0:32.40 rcu_torture_rea   
 3112 root      39  19     0    0    0 R 23.6  0.0   0:28.41 rcu_torture_rea   
 3110 root      39  19     0    0    0 R 22.9  0.0   0:43.46 rcu_torture_rea   
 3113 root      39  19     0    0    0 R 22.9  0.0   0:27.28 rcu_torture_rea   
 3115 root      39  19     0    0    0 R 22.9  0.0   0:33.08 rcu_torture_rea   
 3116 root      39  19     0    0    0 R 22.6  0.0   0:28.10 rcu_torture_rea   

elm3b6:~# for ((i=3110;i<=3117;i++)); do cat /proc/$i/stat | awk '{print $(NF-3)}'; done
3 3 0 3 0 0 0 3

> 2.	2.6.21-rt14.  32-bit kernel on NUMA-Q.

Works just fine.

> 3.	2.6.21.5 + sched-cfs-v2.6.21.5-v17.patch on 64-bit kernel on
> 	AMD Opteron.

All eight readers end up on the same CPU, CPU 2 in this case.  And they
stay there (ten minutes).

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND           
 3081 root      39  19     0    0    0 R 11.3  0.0   1:31.77 rcu_torture_rea   
 3082 root      39  19     0    0    0 R 11.3  0.0   1:31.77 rcu_torture_rea   
 3085 root      39  19     0    0    0 R 11.3  0.0   1:31.78 rcu_torture_rea   
 3079 root      39  19     0    0    0 R 11.0  0.0   1:31.72 rcu_torture_rea   
 3080 root      39  19     0    0    0 R 11.0  0.0   1:31.76 rcu_torture_rea   
 3083 root      39  19     0    0    0 R 11.0  0.0   1:31.76 rcu_torture_rea   
 3084 root      39  19     0    0    0 R 11.0  0.0   1:31.77 rcu_torture_rea   
 3086 root      39  19     0    0    0 R 11.0  0.0   1:31.75 rcu_torture_rea   

Using "taskset" to pin each process to a pair of CPUs (masks 0x3, 0x6,
0xc, and 0x9) forces them to CPUs 0 and 2 -- previously this had spread
them nicely.  So I kept pinning tasks to single CPUs (which defeats
some rcutorture testing) until they did spread, getting the following:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND           
 3079 root      39  19     0    0    0 R 49.9  0.0   3:18.91 rcu_torture_rea   
 3080 root      39  19     0    0    0 R 49.6  0.0   3:15.76 rcu_torture_rea   
 3086 root      39  19     0    0    0 R 49.6  0.0   3:48.82 rcu_torture_rea   
 3083 root      39  19     0    0    0 R 49.3  0.0   2:58.02 rcu_torture_rea   
 3084 root      39  19     0    0    0 R 48.6  0.0   3:00.54 rcu_torture_rea   
 3081 root      39  19     0    0    0 R 47.9  0.0   3:00.55 rcu_torture_rea   
 3082 root      39  19     0    0    0 R 44.6  0.0   3:18.89 rcu_torture_rea   
 3085 root      39  19     0    0    0 R 44.3  0.0   3:07.11 rcu_torture_rea   

elm3b6:~# for ((i=3079;i<=3086;i++)); do cat /proc/$i/stat | awk '{print $(NF-3)}'; done
0 0 2 1 3 2 1 3

> 3.	2.6.21.5 + sched-cfs-v2.6.21.5-v17.patch on 32-bit kernel on
> 	NUMA-Q.

Some imbalance:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND           
 2263 root      39  19     0    0    0 R 92.4  0.0   2:19.69 rcu_torture_rea   
 2265 root      39  19     0    0    0 R 49.8  0.0   1:41.84 rcu_torture_rea   
 2264 root      39  19     0    0    0 R 49.5  0.0   2:11.69 rcu_torture_rea   
 2261 root      39  19     0    0    0 R 48.8  0.0   2:09.95 rcu_torture_rea   
 2262 root      39  19     0    0    0 R 48.8  0.0   3:01.42 rcu_torture_rea   
 2266 root      39  19     0    0    0 R 30.1  0.0   1:47.02 rcu_torture_rea   
 2260 root      39  19     0    0    0 R 29.8  0.0   2:10.07 rcu_torture_rea   
 2267 root      39  19     0    0    0 R 29.8  0.0   1:57.34 rcu_torture_rea   

elm3b132:~# for ((i=2260;i<=2267;i++)); do cat /proc/$i/stat | awk '{print $(NF-3)}'; done
0 1 3 1 2 2 2 0

Has persisted (with some shuffling of CPUs, see below) for about five
minutes, will let it run for an hour or so to see if it is really serious
about this.

3 0 0 2 1 0 2 3 

------------------------------------------------------------------------

Date: Fri, 15 Jun 2007 15:00:17 -0700
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Dinakar Guniguntala <dino@in.ibm.com>,
	Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>,
	Dmitry Adamushko <dmitry.adamushko@gmail.com>
Subject: Re: v2.6.21.4-rt11

On Fri, Jun 15, 2007 at 10:35:39PM +0200, Ingo Molnar wrote:
> 
> (forwarding Paul's mail below to other CFS hackers too.)
> 
> ------------>

[ . . . ]

> > 3.	2.6.21.5 + sched-cfs-v2.6.21.5-v17.patch on 32-bit kernel on
> 	NUMA-Q.
> 
> Some imbalance:
> 
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  2263 root      39  19     0    0    0 R 92.4  0.0   2:19.69 rcu_torture_rea
>  2265 root      39  19     0    0    0 R 49.8  0.0   1:41.84 rcu_torture_rea
>  2264 root      39  19     0    0    0 R 49.5  0.0   2:11.69 rcu_torture_rea
>  2261 root      39  19     0    0    0 R 48.8  0.0   2:09.95 rcu_torture_rea
>  2262 root      39  19     0    0    0 R 48.8  0.0   3:01.42 rcu_torture_rea
>  2266 root      39  19     0    0    0 R 30.1  0.0   1:47.02 rcu_torture_rea
>  2260 root      39  19     0    0    0 R 29.8  0.0   2:10.07 rcu_torture_rea
>  2267 root      39  19     0    0    0 R 29.8  0.0   1:57.34 rcu_torture_rea
> 
> elm3b132:~# for ((i=2260;i<=2267;i++)); do cat /proc/$i/stat | awk '{print $(NF-3)}'; done
> 0 1 3 1 2 2 2 0
> 
> Has persisted (with some shuffling of CPUs, see below) for about five
> minutes, will let it run for an hour or so to see if it is really serious
> about this.
> 
> 3 0 0 2 1 0 2 3 

And when I returned after an hour, it had straightened itself out:
1 3 1 2 2 0 3 0

The 64-bit AMD 4-CPU machines have not straightened themselves out
in the past, but will try an extended run over the weekend to see
if load balancing is just a bit on the slow side.  ;-)

But got distracted for an additional hour, and it is imbalanced again:

2 1 2 0 1 1 3 3

Strange...

						Thanx, Paul

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: v2.6.21.4-rt11
  2007-06-19  9:04                     ` v2.6.21.4-rt11 Ingo Molnar
                                         ` (2 preceding siblings ...)
  2007-06-19 15:08                       ` v2.6.21.4-rt11 Paul E. McKenney
@ 2007-06-19 19:14                       ` Christoph Lameter
  3 siblings, 0 replies; 39+ messages in thread
From: Christoph Lameter @ 2007-06-19 19:14 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Srivatsa Vaddagiri, Paul E. McKenney, Thomas Gleixner,
	Dinakar Guniguntala, Dmitry Adamushko, suresh.b.siddha, pwil3058,
	linux-kernel, akpm

On Tue, 19 Jun 2007, Ingo Molnar wrote:

> I'm wondering, why did this trigger under CFS and not on mainline? 
> Mainline seems to have a similar problem in idle_balance() too, or am i 
> misreading it?

Right. The patch needs to go into mainline as well.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: v2.6.21.4-rt11
  2007-06-19 14:33                       ` v2.6.21.4-rt11 Srivatsa Vaddagiri
@ 2007-06-19 19:15                         ` Christoph Lameter
  0 siblings, 0 replies; 39+ messages in thread
From: Christoph Lameter @ 2007-06-19 19:15 UTC (permalink / raw)
  To: Srivatsa Vaddagiri
  Cc: Ingo Molnar, Paul E. McKenney, Thomas Gleixner,
	Dinakar Guniguntala, Dmitry Adamushko, suresh.b.siddha, pwil3058,
	linux-kernel, akpm

On Tue, 19 Jun 2007, Srivatsa Vaddagiri wrote:

> On Tue, Jun 19, 2007 at 11:04:30AM +0200, Ingo Molnar wrote:
> > I'm wondering, why did this trigger under CFS and not on mainline? 
> > Mainline seems to have a similar problem in idle_balance() too, or am i 
> > misreading it?
> 
> The problem is very much there in mainline. I could recreate it with
> 2.6.22-rc5 (which doesn't have CFS) on the same hardware, with
> CONFIG_NUMA enabled.
> 
> Let me know if you need anything else to be clarified.

This is a bugfix that needs to go into 2.6.22.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: v2.6.21.4-rt11
  2007-06-18 16:14         ` v2.6.21.4-rt11 Katsuya MATSUBARA
@ 2007-06-19  4:04           ` Thomas Gleixner
  0 siblings, 0 replies; 39+ messages in thread
From: Thomas Gleixner @ 2007-06-19  4:04 UTC (permalink / raw)
  To: Katsuya MATSUBARA; +Cc: nelsoneci, mingo, linux-kernel, linux-rt-users

Katsuya-San,

On Tue, 2007-06-19 at 01:14 +0900, Katsuya MATSUBARA wrote:
> > It lacks support for the generic timeofday and clock event layers, which
> > causes the compile breakage.
> 
>  I am working on Renesas SuperH platforms.
>  I faced the similar compile errors
>  because 2.6.21.X in SH does not support GENERIC_TIME yet.
>  I made a workaround patch. Is this correct?

Looks good.

Can you please use "diff -ur" to generate patches in the future? That's
the usual format.

Thanks,

	tglx



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: v2.6.21.4-rt11
  2007-06-17 16:59       ` v2.6.21.4-rt11 Thomas Gleixner
@ 2007-06-18 16:14         ` Katsuya MATSUBARA
  2007-06-19  4:04           ` v2.6.21.4-rt11 Thomas Gleixner
  0 siblings, 1 reply; 39+ messages in thread
From: Katsuya MATSUBARA @ 2007-06-18 16:14 UTC (permalink / raw)
  To: tglx; +Cc: nelsoneci, mingo, linux-kernel, linux-rt-users


From: Thomas Gleixner <tglx@linutronix.de>
Date: Sun, 17 Jun 2007 18:59:18 +0200

> On Sun, 2007-06-17 at 11:49 -0500, Nelson Castillo wrote:
> > > > There are many choices and
> > > > I don't know what is the more friendly. By friendly I mean the one that
> > > > is likely to be merged and that cooperate with you.
> > >
> > > Which choices do you mean ?
> > 
> > I mean implementations. I've seen lot of them but i don't know which one
> > to try (I'm new to RT and the implementation in this thread seems to
> > be very nice).
> 
> Thanks :)
> 
> > > >   http://people.redhat.com/mingo/realtime-preempt/patch-2.6.21.4-rt14
> > > >
> > > > : undefined reference to `usecs_to_cycles'
> > > > make: *** [.tmp_vmlinux1] Error 1
> > >
> > > Which ARM sub arch ?
> > 
> > sub arch AT91 -- (Atmel AT91RM9200 processor).
> 
> It lacks support for the generic timeofday and clock event layers, which
> causes the compile breakage.

 I am working on Renesas SuperH platforms.
 I faced similar compile errors
 because 2.6.21.x on SH does not support GENERIC_TIME yet.
 I made a workaround patch. Is this correct?

 Thanks,
---
 Katsuya Matsubara @ Igel Co., Ltd
 matsu@igel.co.jp
 
 diff -cr linux-2.6.21.5-rt14/kernel/hrtimer.c linux-2.6.21.5-rt14-nogt/kernel/hrtimer.c
*** linux-2.6.21.5-rt14/kernel/hrtimer.c	2007-06-18 19:55:55.000000000 +0900
--- linux-2.6.21.5-rt14-nogt/kernel/hrtimer.c	2007-06-16 16:36:10.000000000 +0900
***************
*** 119,127 ****
--- 119,131 ----
  
  	do {
  		seq = read_seqbegin(&xtime_lock);
+ #ifdef CONFIG_GENERIC_TIME
  		*ts = xtime;
  		nsecs = __get_nsec_offset();
  		timespec_add_ns(ts, nsecs);
+ #else
+ 		getnstimeofday(ts);
+ #endif
  		tomono = wall_to_monotonic;
  
  	} while (read_seqretry(&xtime_lock, seq));
diff -cr linux-2.6.21.5-rt14/kernel/time/ntp.c linux-2.6.21.5-rt14-nogt/kernel/time/ntp.c
*** linux-2.6.21.5-rt14/kernel/time/ntp.c	2007-06-18 19:55:56.000000000 +0900
--- linux-2.6.21.5-rt14-nogt/kernel/time/ntp.c	2007-06-16 16:37:46.000000000 +0900
***************
*** 120,126 ****
--- 120,128 ----
  			 */
  			time_interpolator_update(-NSEC_PER_SEC);
  			time_state = TIME_OOP;
+ #ifdef CONFIG_GENERIC_TIME
  			warp_check_clock_was_changed();
+ #endif
  			clock_was_set();
  			printk(KERN_NOTICE "Clock: inserting leap second "
  					"23:59:60 UTC\n");
***************
*** 136,142 ****
--- 138,146 ----
  			 */
  			time_interpolator_update(NSEC_PER_SEC);
  			time_state = TIME_WAIT;
+ #ifdef CONFIG_GENERIC_TIME
  			warp_check_clock_was_changed();
+ #endif
  			clock_was_set();
  			printk(KERN_NOTICE "Clock: deleting leap second "
  					"23:59:59 UTC\n");
diff -cr linux-2.6.21.5-rt14/kernel/time.c linux-2.6.21.5-rt14-nogt/kernel/time.c
*** linux-2.6.21.5-rt14/kernel/time.c	2007-06-18 19:55:56.000000000 +0900
--- linux-2.6.21.5-rt14-nogt/kernel/time.c	2007-06-16 16:36:10.000000000 +0900
***************
*** 135,141 ****
--- 135,143 ----
  	wall_to_monotonic.tv_sec -= sys_tz.tz_minuteswest * 60;
  	xtime.tv_sec += sys_tz.tz_minuteswest * 60;
  	time_interpolator_reset();
+ #ifdef CONFIG_GENERIC_TIME
  	warp_check_clock_was_changed();
+ #endif
  	write_sequnlock_irq(&xtime_lock);
  	clock_was_set();
  }
***************
*** 320,326 ****
--- 322,330 ----
  		time_esterror = NTP_PHASE_LIMIT;
  		time_interpolator_reset();
  	}
+ #ifdef CONFIG_GENERIC_TIME
  	warp_check_clock_was_changed();
+ #endif
  	write_sequnlock_irq(&xtime_lock);
  	clock_was_set();
  	return 0;
diff -cr linux-2.6.21.5-rt14/kernel/timer.c linux-2.6.21.5-rt14-nogt/kernel/timer.c
*** linux-2.6.21.5-rt14/kernel/timer.c	2007-06-18 19:55:56.000000000 +0900
--- linux-2.6.21.5-rt14-nogt/kernel/timer.c	2007-06-16 16:36:10.000000000 +0900
***************
*** 1165,1171 ****
--- 1165,1173 ----
  	clock->cycle_accumulated = 0;
  	clock->error = 0;
  	timekeeping_suspended = 0;
+ #ifdef CONFIG_GENERIC_TIME
  	warp_check_clock_was_changed();
+ #endif
  	write_sequnlock_irqrestore(&xtime_lock, flags);
  
  	touch_softlockup_watchdog();
***************
*** 1728,1736 ****
--- 1730,1742 ----
  		 * too.
  		 */
  
+ #ifdef CONFIG_GENERIC_TIME
  		tp = xtime;
  		nsecs = __get_nsec_offset();
  		timespec_add_ns(&tp, nsecs);
+ #else
+ 		getnstimeofday(&tp);
+ #endif
  
  		tp.tv_sec += wall_to_monotonic.tv_sec;
  		tp.tv_nsec += wall_to_monotonic.tv_nsec;


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: v2.6.21.4-rt11
  2007-06-17 16:49     ` v2.6.21.4-rt11 Nelson Castillo
@ 2007-06-17 16:59       ` Thomas Gleixner
  2007-06-18 16:14         ` v2.6.21.4-rt11 Katsuya MATSUBARA
  0 siblings, 1 reply; 39+ messages in thread
From: Thomas Gleixner @ 2007-06-17 16:59 UTC (permalink / raw)
  To: Nelson Castillo; +Cc: Ingo Molnar, linux-kernel, linux-rt-users

On Sun, 2007-06-17 at 11:49 -0500, Nelson Castillo wrote:
> > > There are many choices and
> > > I don't know what is the more friendly. By friendly I mean the one that
> > > is likely to be merged and that cooperate with you.
> >
> > Which choices do you mean ?
> 
> I mean implementations. I've seen lot of them but i don't know which one
> to try (I'm new to RT and the implementation in this thread seems to
> be very nice).

Thanks :)

> > >   http://people.redhat.com/mingo/realtime-preempt/patch-2.6.21.4-rt14
> > >
> > > : undefined reference to `usecs_to_cycles'
> > > make: *** [.tmp_vmlinux1] Error 1
> >
> > Which ARM sub arch ?
> 
> sub arch AT91 -- (Atmel AT91RM9200 processor).

It lacks support for the generic timeofday and clock event layers, which
causes the compile breakage.

I'll take a look at the compile errors and ping somebody who is working
on AT91 support to send out the patches ASAP.

	tglx



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: v2.6.21.4-rt11
  2007-06-17 16:43   ` v2.6.21.4-rt11 Thomas Gleixner
@ 2007-06-17 16:49     ` Nelson Castillo
  2007-06-17 16:59       ` v2.6.21.4-rt11 Thomas Gleixner
  0 siblings, 1 reply; 39+ messages in thread
From: Nelson Castillo @ 2007-06-17 16:49 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Ingo Molnar, linux-kernel, linux-rt-users

On 6/17/07, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Sun, 2007-06-17 at 11:15 -0500, Nelson Castillo wrote:
> > >      http://rt.wiki.kernel.org
> >
> > Not for ARM yet :(
> >
> > What should I try for the ARM architecture?
>
> ARM has a lot of sub architectures and not all of them are supported
> yet.

I see.

> > There are many choices and
> > I don't know what is the more friendly. By friendly I mean the one that
> > is likely to be merged and that cooperate with you.
>
> Which choices do you mean ?

I mean implementations. I've seen a lot of them, but I don't know which
one to try (I'm new to RT, and the implementation in this thread seems
very nice).

>
> >   http://people.redhat.com/mingo/realtime-preempt/patch-2.6.21.4-rt14
> >
> > : undefined reference to `usecs_to_cycles'
> > make: *** [.tmp_vmlinux1] Error 1
>
> Which ARM sub arch ?

sub arch AT91 -- (Atmel AT91RM9200 processor).

Thanks,
Nelson.-


-- 
http://arhuaco.org
http://emQbit.com

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: v2.6.21.4-rt11
  2007-06-17 16:15 ` v2.6.21.4-rt11 Nelson Castillo
@ 2007-06-17 16:43   ` Thomas Gleixner
  2007-06-17 16:49     ` v2.6.21.4-rt11 Nelson Castillo
  0 siblings, 1 reply; 39+ messages in thread
From: Thomas Gleixner @ 2007-06-17 16:43 UTC (permalink / raw)
  To: Nelson Castillo; +Cc: Ingo Molnar, linux-kernel, linux-rt-users

On Sun, 2007-06-17 at 11:15 -0500, Nelson Castillo wrote:
> >      http://rt.wiki.kernel.org
> 
> Not for ARM yet :(
> 
> What should I try for the ARM architecture? 

ARM has a lot of sub architectures and not all of them are supported
yet.

> There are many choices and
> I don't know what is the more friendly. By friendly I mean the one that
> is likely to be merged and that cooperate with you.

Which choices do you mean ?

>   http://people.redhat.com/mingo/realtime-preempt/patch-2.6.21.4-rt14
> 
> : undefined reference to `usecs_to_cycles'
> make: *** [.tmp_vmlinux1] Error 1

Which ARM sub arch ?

	tglx



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: v2.6.21.4-rt11
  2007-06-09 21:05 v2.6.21.4-rt11 Ingo Molnar
  2007-06-11  1:19 ` v2.6.21.4-rt11 Paul E. McKenney
  2007-06-12  6:03 ` v2.6.21.4-rt11 Eric St-Laurent
@ 2007-06-17 16:15 ` Nelson Castillo
  2007-06-17 16:43   ` v2.6.21.4-rt11 Thomas Gleixner
  2 siblings, 1 reply; 39+ messages in thread
From: Nelson Castillo @ 2007-06-17 16:15 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, linux-rt-users

On 6/9/07, Ingo Molnar <mingo@elte.hu> wrote:
>
> i'm pleased to announce the v2.6.21.4-rt11 kernel, which can be
> downloaded from the usual place:
>
>      http://people.redhat.com/mingo/realtime-preempt/
>
> more info about the -rt patchset can be found in the RT wiki:
>
>      http://rt.wiki.kernel.org

Not for ARM yet :(

What should I try for the ARM architecture? There are many choices and
I don't know which is the friendliest. By friendly I mean the one that
is likely to be merged and that cooperates with you.

> to build a 2.6.21.4-rt11 tree, the following patches should be applied:
>
>     http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.21.4.tar.bz2
>     http://people.redhat.com/mingo/realtime-preempt/patch-2.6.21.4-rt11
>


Using this patch:

  http://people.redhat.com/mingo/realtime-preempt/patch-2.6.21.4-rt14

  LD      init/built-in.o
  LD      .tmp_vmlinux1
kernel/built-in.o(.text+0xd3f0): In function `do_sys_settimeofday':
: undefined reference to `warp_check_clock_was_changed'
kernel/built-in.o(.text+0x12588): In function `timekeeping_resume':
: undefined reference to `warp_check_clock_was_changed'
kernel/built-in.o(.text+0x132f8): In function `do_sysinfo':
: undefined reference to `__get_nsec_offset'
kernel/built-in.o(.text+0x20a04): In function `ktime_get_ts':
: undefined reference to `__get_nsec_offset'
kernel/built-in.o(.text+0x221c0): In function `$a':
: undefined reference to `warp_check_clock_was_changed'
kernel/built-in.o(.text+0x22208): In function `$a':
: undefined reference to `warp_check_clock_was_changed'
kernel/built-in.o(.text+0x2b9dc): In function `$a':
: undefined reference to `usecs_to_cycles'
make: *** [.tmp_vmlinux1] Error 1

Regards.

-- 
http://arhuaco.org
http://emQbit.com

^ permalink raw reply	[flat|nested] 39+ messages in thread

* RE: v2.6.21.4-rt11
  2007-06-12 13:00     ` v2.6.21.4-rt11 Pallipadi, Venkatesh
@ 2007-06-13  1:37       ` Eric St-Laurent
  0 siblings, 0 replies; 39+ messages in thread
From: Eric St-Laurent @ 2007-06-13  1:37 UTC (permalink / raw)
  To: Pallipadi, Venkatesh
  Cc: Ingo Molnar, linux-kernel, linux-rt-users, Thomas Gleixner,
	Dinakar Guniguntala

On Tue, 2007-12-06 at 06:00 -0700, Pallipadi, Venkatesh wrote:
> 
> >-----Original Message-----

> Yes. Force_hpet part is should have worked..
> Eric: Can you send me the output of 'lspci -n on your system.
> We need to double check we are covering all ICH7 ids.

Here it is:

00:00.0 0600: 8086:2770 (rev 02)
00:02.0 0300: 8086:2772 (rev 02)
00:1b.0 0403: 8086:27d8 (rev 01)
00:1c.0 0604: 8086:27d0 (rev 01)
00:1c.1 0604: 8086:27d2 (rev 01)
00:1d.0 0c03: 8086:27c8 (rev 01)
00:1d.1 0c03: 8086:27c9 (rev 01)
00:1d.2 0c03: 8086:27ca (rev 01)
00:1d.3 0c03: 8086:27cb (rev 01)
00:1d.7 0c03: 8086:27cc (rev 01)
00:1e.0 0604: 8086:244e (rev e1)
00:1f.0 0601: 8086:27b8 (rev 01)
00:1f.1 0101: 8086:27df (rev 01)
00:1f.2 0101: 8086:27c0 (rev 01)
00:1f.3 0c05: 8086:27da (rev 01)
01:0a.0 0604: 3388:0021 (rev 11)
02:0c.0 0c03: 1033:0035 (rev 41)
02:0c.1 0c03: 1033:0035 (rev 41)
02:0c.2 0c03: 1033:00e0 (rev 02)
02:0d.0 0c00: 1106:3044 (rev 46)
03:00.0 0200: 8086:109a

Adding the id for PCI_DEVICE_ID_INTEL_ICH7_0 (27b8) should do the trick.

I've patched my kernel and was ready to test it, but in the meantime I
did a BIOS upgrade (bad idea...) and with the new version the HPET timer
is detected via ACPI.

Unfortunately it seems that downgrading the BIOS is a lot more trouble
than upgrading it. So I cannot easily test the force enable anymore.

Anyway it works now. Here is my patch if it's any use to you:


diff -uprN linux-2.6.21.4.orig/arch/i386/kernel/quirks.c linux-2.6.21.4/arch/i386/kernel/quirks.c
--- linux-2.6.21.4.orig/arch/i386/kernel/quirks.c	Tue Jun 12 10:03:18 2007
+++ linux-2.6.21.4/arch/i386/kernel/quirks.c	Tue Jun 12 10:08:02 2007
@@ -149,6 +149,8 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_I
                          ich_force_enable_hpet);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH6_1,
                          ich_force_enable_hpet);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH7_0,
+                         ich_force_enable_hpet);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH7_1,
                          ich_force_enable_hpet);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH7_31,


Best regards,

- Eric



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: v2.6.21.4-rt11
  2007-06-12 21:37                 ` v2.6.21.4-rt11 Ingo Molnar
@ 2007-06-13  1:27                   ` Paul E. McKenney
  0 siblings, 0 replies; 39+ messages in thread
From: Paul E. McKenney @ 2007-06-13  1:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-rt-users, Thomas Gleixner, Dinakar Guniguntala

On Tue, Jun 12, 2007 at 11:37:58PM +0200, Ingo Molnar wrote:
> 
> * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> 
> > Not a biggie for me, since I can easily do the taskset commands to 
> > force the processes to spread out, but I am worried that casual users 
> > of rcutorture won't know to do this -- thus not really torturing RCU. 
> > It would not be hard to modify rcutorture to affinity the tasks so as 
> > to spread them, but this seems a bit ugly.
> 
> does it get any better if you renice them from +19 to 0? (and then back 
> to +19?)

Interesting!

That did spread them evenly across two CPUs, but not across all four.

I took a look at CFS, which seems to operate in terms of milliseconds.
Since the rcu_torture_reader() code enters the scheduler on each
iteration, it would not give CFS millisecond-scale bursts of CPU
consumption, perhaps preventing it from doing reasonable load balancing.

So I inserted the following code at the beginning of rcu_torture_reader():

	set_user_nice(current, 19);
	set_user_nice(current, 0);
	for (idx = 0; idx < 1000; idx++) {
		udelay(10);
	}
	set_user_nice(current, 19);

This worked much better:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND           
18600 root      39  19     0    0    0 R   50  0.0   0:09.57 rcu_torture_rea    
18599 root      39  19     0    0    0 R   50  0.0   0:09.56 rcu_torture_rea    
18598 root      39  19     0    0    0 R   49  0.0   0:10.33 rcu_torture_rea    
18602 root      39  19     0    0    0 R   49  0.0   0:10.34 rcu_torture_rea    
18596 root      39  19     0    0    0 R   47  0.0   0:09.48 rcu_torture_rea    
18601 root      39  19     0    0    0 R   46  0.0   0:09.56 rcu_torture_rea    
18595 root      39  19     0    0    0 R   45  0.0   0:09.23 rcu_torture_rea    
18597 root      39  19     0    0    0 R   44  0.0   0:10.92 rcu_torture_rea    
18590 root      39  19     0    0    0 R   10  0.0   0:02.23 rcu_torture_wri    
18591 root      39  19     0    0    0 D    2  0.0   0:00.34 rcu_torture_fak    
18592 root      39  19     0    0    0 D    2  0.0   0:00.35 rcu_torture_fak    
18593 root      39  19     0    0    0 D    2  0.0   0:00.35 rcu_torture_fak    
18594 root      39  19     0    0    0 D    2  0.0   0:00.33 rcu_torture_fak    
18603 root      15  -5     0    0    0 S    1  0.0   0:00.06 rcu_torture_sta    

(The first eight tasks are readers, while the last six tasks are update
and statistics threads that don't consume so much CPU, so the above is
pretty close to optimal.)

I stopped and restarted rcutorture several times, and it spread nicely
each time, at least aside from the time that makewhatis decided to fire
up just as I started rcutorture.

But this is admittedly a -very- crude hack.

One approach would be to make them all spin until a few milliseconds
after the last one was created.  I would like to spread the readers
separately from the other tasks, which could be done by taking a two-stage
approach, spreading the writer and fakewriter tasks first, then spreading
the readers.  This seems a bit nicer, and I will play with it a bit.

In the meantime, thoughts on more-maintainable ways of making this work?

						Thanx, Paul

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: v2.6.21.4-rt11
  2007-06-11 22:18               ` v2.6.21.4-rt11 Paul E. McKenney
@ 2007-06-12 21:37                 ` Ingo Molnar
  2007-06-13  1:27                   ` v2.6.21.4-rt11 Paul E. McKenney
  0 siblings, 1 reply; 39+ messages in thread
From: Ingo Molnar @ 2007-06-12 21:37 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, linux-rt-users, Thomas Gleixner, Dinakar Guniguntala


* Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:

> Not a biggie for me, since I can easily do the taskset commands to 
> force the processes to spread out, but I am worried that casual users 
> of rcutorture won't know to do this -- thus not really torturing RCU. 
> It would not be hard to modify rcutorture to affinity the tasks so as 
> to spread them, but this seems a bit ugly.

does it get any better if you renice them from +19 to 0? (and then back 
to +19?)

	Ingo

^ permalink raw reply	[flat|nested] 39+ messages in thread

* RE: v2.6.21.4-rt11
  2007-06-12  7:32   ` v2.6.21.4-rt11 Ingo Molnar
@ 2007-06-12 13:00     ` Pallipadi, Venkatesh
  2007-06-13  1:37       ` v2.6.21.4-rt11 Eric St-Laurent
  0 siblings, 1 reply; 39+ messages in thread
From: Pallipadi, Venkatesh @ 2007-06-12 13:00 UTC (permalink / raw)
  To: Ingo Molnar, Eric St-Laurent
  Cc: linux-kernel, linux-rt-users, Thomas Gleixner, Dinakar Guniguntala

 

>-----Original Message-----
>From: Ingo Molnar [mailto:mingo@elte.hu] 
>Sent: Tuesday, June 12, 2007 12:32 AM
>To: Eric St-Laurent
>Cc: linux-kernel@vger.kernel.org; 
>linux-rt-users@vger.kernel.org; Thomas Gleixner; Dinakar 
>Guniguntala; Pallipadi, Venkatesh
>Subject: Re: v2.6.21.4-rt11
>
>
>(Cc:-ed Venki for the force-hpet issue below)
>
>* Eric St-Laurent <ericstl34@sympatico.ca> wrote:
>
>> On Sat, 2007-09-06 at 23:05 +0200, Ingo Molnar wrote:
>> > i'm pleased to announce the v2.6.21.4-rt11 kernel, which can be 
>> > downloaded from the usual place:
>> 
>> I'm running 2.6.21.4-rt12-cfs-v17 (x86_64), so far no 
>problems. I like 
>> this kernel a lot, it's feels quite smooth.
>
>yeah, that's probably CFS in the works :-) That combined with 
>PREEMPT_RT 
>makes for a really snappy desktop.
>
>> One little thing, no HPET timer is detected. By looking at 
>the patch, 
>> even the force detect code is there, it should work.
>> 
>> The hpet timer is not available as a clocksource and only one hpet
>> related message is present in dmesg:
>> 
>> PM: Adding info for No Bus:hpet
>> 
>> This is on a Asus P5LD2-VM motherboard (ICH7)
>> 
>> Relevant config bits:
>> 
>> CONFIG_HPET_TIMER=y
>> # CONFIG_HPET_EMULATE_RTC is not set
>> CONFIG_HPET=y
>> # CONFIG_HPET_RTC_IRQ is not set
>> CONFIG_HPET_MMAP=y
>> 
>> Should I enable one of the two other options? Any ideas?
>
>Venki, is this ICH7 board supposed to work with force-hpet?
>

Yes, the force_hpet part should have worked.
Eric: can you send me the output of 'lspci -n' on your system?
We need to double-check that we are covering all ICH7 IDs.

Thanks,
Venki

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: v2.6.21.4-rt11
  2007-06-12  6:03 ` v2.6.21.4-rt11 Eric St-Laurent
@ 2007-06-12  7:32   ` Ingo Molnar
  2007-06-12 13:00     ` v2.6.21.4-rt11 Pallipadi, Venkatesh
  0 siblings, 1 reply; 39+ messages in thread
From: Ingo Molnar @ 2007-06-12  7:32 UTC (permalink / raw)
  To: Eric St-Laurent
  Cc: linux-kernel, linux-rt-users, Thomas Gleixner,
	Dinakar Guniguntala, Venki Pallipadi


(Cc:-ed Venki for the force-hpet issue below)

* Eric St-Laurent <ericstl34@sympatico.ca> wrote:

> On Sat, 2007-09-06 at 23:05 +0200, Ingo Molnar wrote:
> > i'm pleased to announce the v2.6.21.4-rt11 kernel, which can be 
> > downloaded from the usual place:
> 
> I'm running 2.6.21.4-rt12-cfs-v17 (x86_64), so far no problems. I like 
> this kernel a lot, it's feels quite smooth.

yeah, that's probably CFS in the works :-) That combined with PREEMPT_RT 
makes for a really snappy desktop.

> One little thing, no HPET timer is detected. By looking at the patch, 
> even the force detect code is there, it should work.
> 
> The hpet timer is not available as a clocksource and only one hpet
> related message is present in dmesg:
> 
> PM: Adding info for No Bus:hpet
> 
> This is on a Asus P5LD2-VM motherboard (ICH7)
> 
> Relevant config bits:
> 
> CONFIG_HPET_TIMER=y
> # CONFIG_HPET_EMULATE_RTC is not set
> CONFIG_HPET=y
> # CONFIG_HPET_RTC_IRQ is not set
> CONFIG_HPET_MMAP=y
> 
> Should I enable one of the two other options? Any ideas?

Venki, is this ICH7 board supposed to work with force-hpet?

	Ingo

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: v2.6.21.4-rt11
  2007-06-09 21:05 v2.6.21.4-rt11 Ingo Molnar
  2007-06-11  1:19 ` v2.6.21.4-rt11 Paul E. McKenney
@ 2007-06-12  6:03 ` Eric St-Laurent
  2007-06-12  7:32   ` v2.6.21.4-rt11 Ingo Molnar
  2007-06-17 16:15 ` v2.6.21.4-rt11 Nelson Castillo
  2 siblings, 1 reply; 39+ messages in thread
From: Eric St-Laurent @ 2007-06-12  6:03 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-rt-users, Thomas Gleixner, Dinakar Guniguntala

On Sat, 2007-09-06 at 23:05 +0200, Ingo Molnar wrote:
> i'm pleased to announce the v2.6.21.4-rt11 kernel, which can be 
> downloaded from the usual place:
>  

I'm running 2.6.21.4-rt12-cfs-v17 (x86_64), so far no problems. I like
this kernel a lot; it feels quite smooth.

One little thing: no HPET timer is detected. Looking at the patch,
the force-detect code is there, so it should work.

The hpet timer is not available as a clocksource and only one hpet
related message is present in dmesg:

PM: Adding info for No Bus:hpet

This is on an Asus P5LD2-VM motherboard (ICH7)

Relevant config bits:

CONFIG_HPET_TIMER=y
# CONFIG_HPET_EMULATE_RTC is not set
CONFIG_HPET=y
# CONFIG_HPET_RTC_IRQ is not set
CONFIG_HPET_MMAP=y

Should I enable one of the two other options? Any ideas?


Best regards,

- Eric




* Re: v2.6.21.4-rt11
  2007-06-11 20:44             ` v2.6.21.4-rt11 Paul E. McKenney
@ 2007-06-11 22:18               ` Paul E. McKenney
  2007-06-12 21:37                 ` v2.6.21.4-rt11 Ingo Molnar
  0 siblings, 1 reply; 39+ messages in thread
From: Paul E. McKenney @ 2007-06-11 22:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-rt-users, Thomas Gleixner, Dinakar Guniguntala

On Mon, Jun 11, 2007 at 01:44:27PM -0700, Paul E. McKenney wrote:
> On Mon, Jun 11, 2007 at 10:18:06AM -0700, Paul E. McKenney wrote:
> > On Mon, Jun 11, 2007 at 08:55:27AM -0700, Paul E. McKenney wrote:
> > > On Mon, Jun 11, 2007 at 05:38:55PM +0200, Ingo Molnar wrote:
> > > > 
> > > > * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> > > > 
> > > > > > hm, what affinity do they start out with? Could they all be pinned 
> > > > > > to CPU#0 by default?
> > > > > 
> > > > > They start off with affinity masks of 0xf on a 4-CPU system.  I would 
> > > > > expect them to load-balance across the four CPUs, but they stay all on 
> > > > > the same CPU until long after I lose patience (many minutes).
> > > > 
> > > > ugh. Would be nice to figure out why this happens. I enabled rcutorture 
> > > > on a dual-core CPU and all the threads are spread evenly.
> > > 
> > > Here is the /proc/cpuinfo in case this helps.  I am starting up a test
> > > on a dual-core CPU to see if that works better.
> > 
> > And this quickly load-balanced to put a pair of readers on each CPU.
> > Later, it moved one of the readers so that it is now running with
> > one reader on one of the CPUs, and the remaining three readers on the
> > other CPU.
> > 
> > Argh...  this is with 2.6.21-rt1...  Need to reboot with 2.6.21.4-rt12...
> 
> OK, here are a couple of snapshots from "top" on a two-way system.
> It seems to cycle back and forth between these two states.

And on the 4-CPU box:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND           
 3112 root      39  19     0    0    0 R 11.6  0.0   0:44.34 rcu_torture_rea   
 3114 root      39  19     0    0    0 R 11.6  0.0   0:44.34 rcu_torture_rea   
 3115 root      39  19     0    0    0 R 11.6  0.0   0:44.34 rcu_torture_rea   
 3116 root      39  19     0    0    0 R 11.6  0.0   0:44.34 rcu_torture_rea   
 3109 root      39  19     0    0    0 R 11.3  0.0   0:44.33 rcu_torture_rea   
 3110 root      39  19     0    0    0 R 11.3  0.0   0:44.33 rcu_torture_rea   
 3111 root      39  19     0    0    0 R 11.3  0.0   0:44.34 rcu_torture_rea   
 3113 root      39  19     0    0    0 R 11.3  0.0   0:44.34 rcu_torture_rea   
 3108 root      39  19     0    0    0 D  6.0  0.0   0:24.35 rcu_torture_wri   

All are on CPU zero:

elm3b6:~# cat /proc/3109/stat | awk '{print $(NF-3)}'
0
elm3b6:~# cat /proc/3110/stat | awk '{print $(NF-3)}'
0
elm3b6:~# cat /proc/3111/stat | awk '{print $(NF-3)}'
0
elm3b6:~# cat /proc/3112/stat | awk '{print $(NF-3)}'
0
elm3b6:~# cat /proc/3113/stat | awk '{print $(NF-3)}'
0
elm3b6:~# cat /proc/3114/stat | awk '{print $(NF-3)}'
0
elm3b6:~# cat /proc/3115/stat | awk '{print $(NF-3)}'
0
elm3b6:~# cat /proc/3116/stat | awk '{print $(NF-3)}'
0
elm3b6:~# cat /proc/3108/stat | awk '{print $(NF-3)}'
0
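[The nine cat/awk pairs above can be written as one loop; it uses the same fourth-from-last field of /proc/PID/stat. Note that on much newer kernels /proc/PID/stat gained extra trailing fields, so $(NF-3) no longer lands on the "processor" field there; it did in the 2.6.21 era.]

```shell
# Report the CPU each rcutorture thread last ran on, mirroring the
# per-PID awk commands above (PIDs are the ones from this run).
for pid in 3108 3109 3110 3111 3112 3113 3114 3115 3116; do
    [ -r /proc/$pid/stat ] || continue    # skip threads that have exited
    echo "pid $pid: CPU $(awk '{print $(NF-3)}' /proc/$pid/stat)"
done
```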

All have their affinity masks at f (allowing them to run on all CPUs):

elm3b6:~# taskset -p 3109
pid 3109's current affinity mask: f
elm3b6:~# taskset -p 3110
pid 3110's current affinity mask: f
elm3b6:~# taskset -p 3111
pid 3111's current affinity mask: f
elm3b6:~# taskset -p 3112
pid 3112's current affinity mask: f
elm3b6:~# taskset -p 3113
pid 3113's current affinity mask: f
elm3b6:~# taskset -p 3114
pid 3114's current affinity mask: f
elm3b6:~# taskset -p 3115
pid 3115's current affinity mask: f
elm3b6:~# taskset -p 3116
pid 3116's current affinity mask: f
elm3b6:~# taskset -p 3108
pid 3108's current affinity mask: f
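[Likewise, the affinity check can be done in one pass over the same PID list:]

```shell
# Confirm each rcutorture thread still has the full 0xf affinity mask.
for pid in 3108 3109 3110 3111 3112 3113 3114 3115 3116; do
    taskset -p $pid 2>/dev/null || echo "pid $pid gone"
done
```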

Not a biggie for me, since I can easily use taskset commands to
force the processes to spread out, but I am worried that casual users
of rcutorture won't know to do this -- and thus won't really be
torturing RCU.  It would not be hard to modify rcutorture to set the
tasks' CPU affinities so as to spread them, but this seems a bit ugly.

						Thanx, Paul


* Re: v2.6.21.4-rt11
  2007-06-11 17:18           ` v2.6.21.4-rt11 Paul E. McKenney
@ 2007-06-11 20:44             ` Paul E. McKenney
  2007-06-11 22:18               ` v2.6.21.4-rt11 Paul E. McKenney
  0 siblings, 1 reply; 39+ messages in thread
From: Paul E. McKenney @ 2007-06-11 20:44 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-rt-users, Thomas Gleixner, Dinakar Guniguntala

On Mon, Jun 11, 2007 at 10:18:06AM -0700, Paul E. McKenney wrote:
> On Mon, Jun 11, 2007 at 08:55:27AM -0700, Paul E. McKenney wrote:
> > On Mon, Jun 11, 2007 at 05:38:55PM +0200, Ingo Molnar wrote:
> > > 
> > > * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> > > 
> > > > > hm, what affinity do they start out with? Could they all be pinned 
> > > > > to CPU#0 by default?
> > > > 
> > > > They start off with affinity masks of 0xf on a 4-CPU system.  I would 
> > > > expect them to load-balance across the four CPUs, but they stay all on 
> > > > the same CPU until long after I lose patience (many minutes).
> > > 
> > > ugh. Would be nice to figure out why this happens. I enabled rcutorture 
> > > on a dual-core CPU and all the threads are spread evenly.
> > 
> > Here is the /proc/cpuinfo in case this helps.  I am starting up a test
> > on a dual-core CPU to see if that works better.
> 
> And this quickly load-balanced to put a pair of readers on each CPU.
> Later, it moved one of the readers so that it is now running with
> one reader on one of the CPUs, and the remaining three readers on the
> other CPU.
> 
> Argh...  this is with 2.6.21-rt1...  Need to reboot with 2.6.21.4-rt12...

OK, here are a couple of snapshots from "top" on a two-way system.
It seems to cycle back and forth between these two states.

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND           
20126 root      39  19     0    0    0 R   47  0.0  11:38.62 rcu_torture_rea    
20129 root      39  19     0    0    0 R   47  0.0  13:28.06 rcu_torture_rea    
20127 root      39  19     0    0    0 R   43  0.0  12:39.83 rcu_torture_rea    
20128 root      39  19     0    0    0 R   43  0.0  11:50.58 rcu_torture_rea    
20121 root      39  19     0    0    0 R   10  0.0   2:59.69 rcu_torture_wri    
20123 root      39  19     0    0    0 D    2  0.0   0:28.52 rcu_torture_fak    
20125 root      39  19     0    0    0 D    2  0.0   0:28.47 rcu_torture_fak    
20122 root      39  19     0    0    0 D    1  0.0   0:28.38 rcu_torture_fak    
20124 root      39  19     0    0    0 D    1  0.0   0:28.41 rcu_torture_fak    

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND           
20129 root      39  19     0    0    0 R   80  0.0  14:46.56 rcu_torture_rea    
20126 root      39  19     0    0    0 R   33  0.0  12:52.70 rcu_torture_rea    
20128 root      39  19     0    0    0 R   33  0.0  13:01.50 rcu_torture_rea    
20127 root      39  19     0    0    0 R   33  0.0  13:49.68 rcu_torture_rea    
20121 root      39  19     0    0    0 R   13  0.0   3:16.82 rcu_torture_wri    
20122 root      39  19     0    0    0 R    2  0.0   0:31.16 rcu_torture_fak    
20123 root      39  19     0    0    0 R    2  0.0   0:31.25 rcu_torture_fak    
20124 root      39  19     0    0    0 D    2  0.0   0:31.23 rcu_torture_fak    
20125 root      39  19     0    0    0 R    2  0.0   0:31.25 rcu_torture_fak    
12907 root      20   0 12576 1068  796 R    1  0.0   0:08.55 top                

The "preferred" state is the first one.  But given that the readers
will consume all CPU available to them, the scheduler might not be
able to tell the difference.

Perhaps the fakewriters are confusing the scheduler; I will try again
on a 4-CPU machine, leaving them out.

						Thanx, Paul


* Re: v2.6.21.4-rt11
  2007-06-11 15:55         ` v2.6.21.4-rt11 Paul E. McKenney
@ 2007-06-11 17:18           ` Paul E. McKenney
  2007-06-11 20:44             ` v2.6.21.4-rt11 Paul E. McKenney
  0 siblings, 1 reply; 39+ messages in thread
From: Paul E. McKenney @ 2007-06-11 17:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-rt-users, Thomas Gleixner, Dinakar Guniguntala

On Mon, Jun 11, 2007 at 08:55:27AM -0700, Paul E. McKenney wrote:
> On Mon, Jun 11, 2007 at 05:38:55PM +0200, Ingo Molnar wrote:
> > 
> > * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> > 
> > > > hm, what affinity do they start out with? Could they all be pinned 
> > > > to CPU#0 by default?
> > > 
> > > They start off with affinity masks of 0xf on a 4-CPU system.  I would 
> > > expect them to load-balance across the four CPUs, but they stay all on 
> > > the same CPU until long after I lose patience (many minutes).
> > 
> > ugh. Would be nice to figure out why this happens. I enabled rcutorture 
> > on a dual-core CPU and all the threads are spread evenly.
> 
> Here is the /proc/cpuinfo in case this helps.  I am starting up a test
> on a dual-core CPU to see if that works better.

And this quickly load-balanced to put a pair of readers on each CPU.
Later, it moved one of the readers so that it is now running with
one reader on one of the CPUs, and the remaining three readers on the
other CPU.

Argh...  this is with 2.6.21-rt1...  Need to reboot with 2.6.21.4-rt12...

						Thanx, Paul


* Re: v2.6.21.4-rt11
  2007-06-11 15:38       ` v2.6.21.4-rt11 Ingo Molnar
@ 2007-06-11 15:55         ` Paul E. McKenney
  2007-06-11 17:18           ` v2.6.21.4-rt11 Paul E. McKenney
  0 siblings, 1 reply; 39+ messages in thread
From: Paul E. McKenney @ 2007-06-11 15:55 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-rt-users, Thomas Gleixner, Dinakar Guniguntala

On Mon, Jun 11, 2007 at 05:38:55PM +0200, Ingo Molnar wrote:
> 
> * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> 
> > > hm, what affinity do they start out with? Could they all be pinned 
> > > to CPU#0 by default?
> > 
> > They start off with affinity masks of 0xf on a 4-CPU system.  I would 
> > expect them to load-balance across the four CPUs, but they stay all on 
> > the same CPU until long after I lose patience (many minutes).
> 
> ugh. Would be nice to figure out why this happens. I enabled rcutorture 
> on a dual-core CPU and all the threads are spread evenly.

Here is the /proc/cpuinfo in case this helps.  I am starting up a test
on a dual-core CPU to see if that works better.

							Thanx, Paul

processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 5
model name      : AMD Opteron(tm) Processor 844
stepping        : 8
cpu MHz         : 1793.105
cache size      : 1024 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall mmxext lm 3dnowext 3dnow
bogomips        : 3522.56

processor       : 1
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 5
model name      : AMD Opteron(tm) Processor 844
stepping        : 8
cpu MHz         : 1793.105
cache size      : 1024 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall mmxext lm 3dnowext 3dnow
bogomips        : 3579.90

processor       : 2
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 5
model name      : AMD Opteron(tm) Processor 844
stepping        : 8
cpu MHz         : 1793.105
cache size      : 1024 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall mmxext lm 3dnowext 3dnow
bogomips        : 3579.90

processor       : 3
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 5
model name      : AMD Opteron(tm) Processor 844
stepping        : 8
cpu MHz         : 1793.105
cache size      : 1024 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall mmxext lm 3dnowext 3dnow
bogomips        : 3579.90



* Re: v2.6.21.4-rt11
  2007-06-11 14:44     ` v2.6.21.4-rt11 Paul E. McKenney
@ 2007-06-11 15:38       ` Ingo Molnar
  2007-06-11 15:55         ` v2.6.21.4-rt11 Paul E. McKenney
  0 siblings, 1 reply; 39+ messages in thread
From: Ingo Molnar @ 2007-06-11 15:38 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, linux-rt-users, Thomas Gleixner, Dinakar Guniguntala


* Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:

> > hm, what affinity do they start out with? Could they all be pinned 
> > to CPU#0 by default?
> 
> They start off with affinity masks of 0xf on a 4-CPU system.  I would 
> expect them to load-balance across the four CPUs, but they stay all on 
> the same CPU until long after I lose patience (many minutes).

ugh. Would be nice to figure out why this happens. I enabled rcutorture 
on a dual-core CPU and all the threads are spread evenly.

	Ingo


* Re: v2.6.21.4-rt11
  2007-06-11  7:36   ` v2.6.21.4-rt11 Ingo Molnar
@ 2007-06-11 14:44     ` Paul E. McKenney
  2007-06-11 15:38       ` v2.6.21.4-rt11 Ingo Molnar
  0 siblings, 1 reply; 39+ messages in thread
From: Paul E. McKenney @ 2007-06-11 14:44 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-rt-users, Thomas Gleixner, Dinakar Guniguntala

On Mon, Jun 11, 2007 at 09:36:34AM +0200, Ingo Molnar wrote:
> 
> * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> 
> > 2.6.21.4-rt12 boots on 4-CPU Opteron and passes several hours of 
> > rcutorture.  However, if I simply do "modprobe rcutorture", the kernel 
> > threads do not spread across the CPUs as I would expect them to, even 
> > given CFS.  Instead, the readers all stack up on a single CPU, and I 
> > have to use the "taskset" command to spread them out manually.  Is 
> > there some config parameter I am missing out on?
> 
> hm, what affinity do they start out with? Could they all be pinned to 
> CPU#0 by default?

They start off with affinity masks of 0xf on a 4-CPU system.  I would
expect them to load-balance across the four CPUs, but they stay all
on the same CPU until long after I lose patience (many minutes).

Since there are eight readers, I use the following commands:

	taskset -p 3 pid1
	taskset -p 3 pid2
	taskset -p 6 pid3
	taskset -p 6 pid4
	taskset -p c pid5
	taskset -p c pid6
	taskset -p 9 pid7
	taskset -p 9 pid8

where the "pidn" are all replaced by the pids of the torture readers.
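[A loop form of those eight commands might look like the sketch below. READER_PIDS is a placeholder, as above; each consecutive pair of readers shares a two-CPU mask (3, 6, c, 9), so some migration remains possible within each pair.]

```shell
# Spread eight reader PIDs over overlapping two-CPU masks.
# READER_PIDS is hypothetical -- substitute the real rcu_torture_rea PIDs.
READER_PIDS="pid1 pid2 pid3 pid4 pid5 pid6 pid7 pid8"
set -- 3 3 6 6 c c 9 9
for pid in $READER_PIDS; do
    taskset -p "$1" "$pid"
    shift
done
```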

Before I do this, the processes are all sharing a single CPU.  After I
do this, they are spread reasonably nicely over the CPUs.  I do need to
allow some migration in order to fully test the realtime RCU variants
in the various preemption scenarios.

							Thanx, Paul


* Re: v2.6.21.4-rt11
  2007-06-11  1:19 ` v2.6.21.4-rt11 Paul E. McKenney
@ 2007-06-11  7:36   ` Ingo Molnar
  2007-06-11 14:44     ` v2.6.21.4-rt11 Paul E. McKenney
  0 siblings, 1 reply; 39+ messages in thread
From: Ingo Molnar @ 2007-06-11  7:36 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, linux-rt-users, Thomas Gleixner, Dinakar Guniguntala


* Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:

> 2.6.21.4-rt12 boots on 4-CPU Opteron and passes several hours of 
> rcutorture.  However, if I simply do "modprobe rcutorture", the kernel 
> threads do not spread across the CPUs as I would expect them to, even 
> given CFS.  Instead, the readers all stack up on a single CPU, and I 
> have to use the "taskset" command to spread them out manually.  Is 
> there some config parameter I am missing out on?

hm, what affinity do they start out with? Could they all be pinned to 
CPU#0 by default?

	Ingo


* Re: v2.6.21.4-rt11
  2007-06-09 21:05 v2.6.21.4-rt11 Ingo Molnar
@ 2007-06-11  1:19 ` Paul E. McKenney
  2007-06-11  7:36   ` v2.6.21.4-rt11 Ingo Molnar
  2007-06-12  6:03 ` v2.6.21.4-rt11 Eric St-Laurent
  2007-06-17 16:15 ` v2.6.21.4-rt11 Nelson Castillo
  2 siblings, 1 reply; 39+ messages in thread
From: Paul E. McKenney @ 2007-06-11  1:19 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-rt-users, Thomas Gleixner, Dinakar Guniguntala

On Sat, Jun 09, 2007 at 11:05:07PM +0200, Ingo Molnar wrote:
> 
> i'm pleased to announce the v2.6.21.4-rt11 kernel, which can be 
> downloaded from the usual place:
> 
>      http://people.redhat.com/mingo/realtime-preempt/
>   
> more info about the -rt patchset can be found in the RT wiki:
>    
>      http://rt.wiki.kernel.org
> 
> -rt11 is a bit more experimental than usual: it includes the CFS 
> scheduler. Several people have suggested the inclusion of CFS into the 
> -rt tree: the determinism of the CFS scheduler is a nice match to the 
> determinism offered by PREEMPT_RT. The port of CFS to -rt was done by 
> Dinakar Guniguntala. Tested on i686 and x86_64.
> 
> to build a 2.6.21.4-rt11 tree, the following patches should be applied:
>   
>     http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.21.4.tar.bz2
>     http://people.redhat.com/mingo/realtime-preempt/patch-2.6.21.4-rt11

2.6.21.4-rt12 boots on 4-CPU Opteron and passes several hours of
rcutorture.  However, if I simply do "modprobe rcutorture", the kernel
threads do not spread across the CPUs as I would expect them to, even
given CFS.  Instead, the readers all stack up on a single CPU, and I
have to use the "taskset" command to spread them out manually.  Is there
some config parameter I am missing out on?

						Thanx, Paul


* Re: v2.6.21.4-rt11
@ 2007-06-10 17:50 Miguel Botón
  0 siblings, 0 replies; 39+ messages in thread
From: Miguel Botón @ 2007-06-10 17:50 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel

On Sunday 10 June 2007 15:17, Ingo Molnar wrote:
> -rt11 is a bit more experimental than usual: it includes the CFS 
> scheduler.

Great! Finally CFS is included ;)

Right now I'm using a patched kernel (2.6.21.4) with the realtime-preemption
patch, and it works fine, but I noticed something I think you should know about.

There's a problem with mac80211. I'm using "mac80211-8.0.1" 
and "iwlwifi-0.0.25" driver with my "Intel Pro Wireless 3945ABG" card.
When loading the "iwl3945" module or when an application (like wpa_supplicant, 
dhcpcd...) tries to do something with the card, I get this message in dmesg:

BUG: using smp_processor_id() in preemptible [00000000] code: 
wpa_supplicant/11659
caller is ieee80211_set_multicast_list+0x40/0x163 [mac80211]
 [<c0213b1d>] debug_smp_processor_id+0xad/0xb0
 [<f8e860bc>] ieee80211_set_multicast_list+0x40/0x163 [mac80211]
 [<c02e8532>] __dev_mc_upload+0x22/0x23
 [<c02e8686>] dev_mc_upload+0x24/0x37
 [<c02e52f5>] dev_change_flags+0x26/0xf6
 [<c031fc5e>] devinet_ioctl+0x539/0x6aa
 [<c02db972>] sock_ioctl+0xa2/0x1d5
 [<c02db8d0>] sock_ioctl+0x0/0x1d5
 [<c018500f>] do_ioctl+0x1f/0x6d
 [<c01850ad>] vfs_ioctl+0x50/0x273
 [<c0185304>] sys_ioctl+0x34/0x50
 [<c01040c6>] sysenter_past_esp+0x5f/0x85
 [<c0340000>] pfkey_add+0x7c7/0x8d9
 =======================
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [<c0213ac4>] .... debug_smp_processor_id+0x54/0xb0
.....[<00000000>] ..   ( <= _stext+0x3fefed0c/0xc)

Anyway, the wifi card works fine.

I got rid of this message by commenting out the code of the
function "ieee80211_set_multicast_list()" in
"net/mac80211/ieee80211.c", but this isn't a proper fix.

I think you should know about this because kernel 2.6.22 already includes 
mac80211.

Greetings.

-- 
Miguel Botón


* v2.6.21.4-rt11
@ 2007-06-09 21:05 Ingo Molnar
  2007-06-11  1:19 ` v2.6.21.4-rt11 Paul E. McKenney
                   ` (2 more replies)
  0 siblings, 3 replies; 39+ messages in thread
From: Ingo Molnar @ 2007-06-09 21:05 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-rt-users, Thomas Gleixner, Dinakar Guniguntala


i'm pleased to announce the v2.6.21.4-rt11 kernel, which can be 
downloaded from the usual place:
 
     http://people.redhat.com/mingo/realtime-preempt/
  
more info about the -rt patchset can be found in the RT wiki:
   
     http://rt.wiki.kernel.org

-rt11 is a bit more experimental than usual: it includes the CFS 
scheduler. Several people have suggested the inclusion of CFS into the 
-rt tree: the determinism of the CFS scheduler is a nice match to the 
determinism offered by PREEMPT_RT. The port of CFS to -rt was done by 
Dinakar Guniguntala. Tested on i686 and x86_64.
 
to build a 2.6.21.4-rt11 tree, the following patches should be applied:
  
    http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.21.4.tar.bz2
    http://people.redhat.com/mingo/realtime-preempt/patch-2.6.21.4-rt11
  
	Ingo
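[The two-step recipe above amounts to something like the following. The -p1 strip level is the usual convention for these patches but is an assumption here; adjust if the patch does not apply.]

```shell
# Fetch and apply the patches listed in the announcement above.
wget http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.21.4.tar.bz2
wget http://people.redhat.com/mingo/realtime-preempt/patch-2.6.21.4-rt11
tar xjf linux-2.6.21.4.tar.bz2
cd linux-2.6.21.4
patch -p1 < ../patch-2.6.21.4-rt11   # -p1 assumed; check with --dry-run first
```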

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2007-06-19 19:16 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20070613180451.GA16628@elte.hu>
     [not found] ` <20070613184741.GC8125@linux.vnet.ibm.com>
     [not found]   ` <20070613185522.GA27335@elte.hu>
     [not found]     ` <20070613233910.GJ8125@linux.vnet.ibm.com>
     [not found]       ` <20070615144535.GA12078@elte.hu>
     [not found]         ` <20070615151452.GC9301@linux.vnet.ibm.com>
     [not found]           ` <20070615195545.GA28872@elte.hu>
     [not found]             ` <20070616011605.GH9301@linux.vnet.ibm.com>
     [not found]               ` <20070616084434.GG2559@linux.vnet.ibm.com>
     [not found]                 ` <20070616161213.GA2994@linux.vnet.ibm.com>
2007-06-18 15:12                   ` v2.6.21.4-rt11 Srivatsa Vaddagiri
2007-06-18 16:54                     ` v2.6.21.4-rt11 Christoph Lameter
2007-06-18 17:35                       ` v2.6.21.4-rt11 Srivatsa Vaddagiri
2007-06-18 17:59                         ` v2.6.21.4-rt11 Christoph Lameter
2007-06-19  1:52                           ` v2.6.21.4-rt11 Srivatsa Vaddagiri
2007-06-19  2:13                             ` v2.6.21.4-rt11 Siddha, Suresh B
2007-06-19  2:15                           ` v2.6.21.4-rt11 Siddha, Suresh B
2007-06-19  3:46                             ` v2.6.21.4-rt11 Christoph Lameter
2007-06-19  5:49                               ` v2.6.21.4-rt11 Srivatsa Vaddagiri
2007-06-19  8:07                                 ` v2.6.21.4-rt11 Ingo Molnar
2007-06-18 18:06                     ` v2.6.21.4-rt11 Srivatsa Vaddagiri
2007-06-19  9:04                     ` v2.6.21.4-rt11 Ingo Molnar
2007-06-19 10:43                       ` v2.6.21.4-rt11 Srivatsa Vaddagiri
2007-06-19 14:33                       ` v2.6.21.4-rt11 Srivatsa Vaddagiri
2007-06-19 19:15                         ` v2.6.21.4-rt11 Christoph Lameter
2007-06-19 15:08                       ` v2.6.21.4-rt11 Paul E. McKenney
2007-06-19 19:14                       ` v2.6.21.4-rt11 Christoph Lameter
2007-06-10 17:50 v2.6.21.4-rt11 Miguel Botón
  -- strict thread matches above, loose matches on Subject: below --
2007-06-09 21:05 v2.6.21.4-rt11 Ingo Molnar
2007-06-11  1:19 ` v2.6.21.4-rt11 Paul E. McKenney
2007-06-11  7:36   ` v2.6.21.4-rt11 Ingo Molnar
2007-06-11 14:44     ` v2.6.21.4-rt11 Paul E. McKenney
2007-06-11 15:38       ` v2.6.21.4-rt11 Ingo Molnar
2007-06-11 15:55         ` v2.6.21.4-rt11 Paul E. McKenney
2007-06-11 17:18           ` v2.6.21.4-rt11 Paul E. McKenney
2007-06-11 20:44             ` v2.6.21.4-rt11 Paul E. McKenney
2007-06-11 22:18               ` v2.6.21.4-rt11 Paul E. McKenney
2007-06-12 21:37                 ` v2.6.21.4-rt11 Ingo Molnar
2007-06-13  1:27                   ` v2.6.21.4-rt11 Paul E. McKenney
2007-06-12  6:03 ` v2.6.21.4-rt11 Eric St-Laurent
2007-06-12  7:32   ` v2.6.21.4-rt11 Ingo Molnar
2007-06-12 13:00     ` v2.6.21.4-rt11 Pallipadi, Venkatesh
2007-06-13  1:37       ` v2.6.21.4-rt11 Eric St-Laurent
2007-06-17 16:15 ` v2.6.21.4-rt11 Nelson Castillo
2007-06-17 16:43   ` v2.6.21.4-rt11 Thomas Gleixner
2007-06-17 16:49     ` v2.6.21.4-rt11 Nelson Castillo
2007-06-17 16:59       ` v2.6.21.4-rt11 Thomas Gleixner
2007-06-18 16:14         ` v2.6.21.4-rt11 Katsuya MATSUBARA
2007-06-19  4:04           ` v2.6.21.4-rt11 Thomas Gleixner
