linux-kernel.vger.kernel.org archive mirror
* RE: 2.6.1 and irq balancing
@ 2004-01-11 23:59 Nakajima, Jun
  2004-01-12  4:42 ` Bill Davidsen
  2004-01-13  6:50 ` Ethan Weinstein
  0 siblings, 2 replies; 18+ messages in thread
From: Nakajima, Jun @ 2004-01-11 23:59 UTC (permalink / raw)
  To: Ethan Weinstein, Ed Tomlinson; +Cc: linux-kernel, piggin, Kamble, Nitin A

> Admittedly, the machine's load was not high when I took this sample.
> However, creating a great deal of load does not change these statistics
> at all.  Being that there are patches available for 2.4.x kernels to fix
> this, I don't think this at all by design, but what do I know? =)

As far as I understand, 2.6 kernels don't need a patch for this. Are you
saying that even with a significant amount of load, you did not see any
distribution of interrupts? Today's threshold in the kernel is high
because we found that moving interrupts around frequently hurts the
cache and thus lowers performance compared to "do nothing". Can you
try to create significant load on your network (eth0 and eth1) and see
what happens?

Jun 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: 2.6.1 and irq balancing
  2004-01-11 23:59 2.6.1 and irq balancing Nakajima, Jun
@ 2004-01-12  4:42 ` Bill Davidsen
  2004-01-12 14:06   ` Zwane Mwaikambo
  2004-01-12 16:10   ` Martin J. Bligh
  2004-01-13  6:50 ` Ethan Weinstein
  1 sibling, 2 replies; 18+ messages in thread
From: Bill Davidsen @ 2004-01-12  4:42 UTC (permalink / raw)
  To: linux-kernel

Nakajima, Jun wrote:

> 2.6 kernels don't need a patch to it as far as I understand. Are you
> saying that with significant amount of load, you did not see any
> distribution of interrupts? Today's threshold in the kernel is high
> because we found moving around interrupts frequently rather hurt the
> cache and thus lower the performance compared to "do nothing". Can you
> try to create significant load with your network (eth0 and eh1) and see
> what happens? 

How much is significant? The term doesn't really help much. I will say 
that with one NIC taking 120MB/sec of data to a TB database and copying 
to two other machines (~220MB), my interrupts got up into the 5k-12k 
range with essentially CPU0 doing the work, and a few percent going to CPU2.

I'm not sure this is a problem in any way, but some serious load is 
needed to trigger sharing, if indeed the NIC was the source of the ints 
on CPU2.

2x Xeon-2.4GHz, HT enabled. "CPU2" from memory, it was the other 
physical CPU, not another sibling. Worked fine, didn't break, don't 
regard it as a problem.

-- 
bill davidsen <davidsen@tmr.com>
   CTO TMR Associates, Inc
   Doing interesting things with small computers since 1979


* Re: 2.6.1 and irq balancing
  2004-01-12  4:42 ` Bill Davidsen
@ 2004-01-12 14:06   ` Zwane Mwaikambo
  2004-01-12 16:10   ` Martin J. Bligh
  1 sibling, 0 replies; 18+ messages in thread
From: Zwane Mwaikambo @ 2004-01-12 14:06 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Linux Kernel, Nakajima, Jun

On Sun, 11 Jan 2004, Bill Davidsen wrote:

> I'm not sure this is a problem in any way, but some serious load is
> needed to trigger sharing, if indeed the NIC was the source of the ints
> on CPU2.
>
> 2x Xeon-2.4GHz, HT enabled. "CPU2" from memory, it was the other
> physical CPU, not another sibling. Worked fine, didn't break, don't
> regard it as a problem.

Seems to be ok here, 2x Xeon 2.0GHz w/ HT

           CPU0       CPU1       CPU2       CPU3
  0:  322303400          0          0          0    IO-APIC-edge  timer
  1:     255712          0          0          0    IO-APIC-edge  i8042
  2:          0          0          0          0          XT-PIC  cascade
  3:        828          0          0          0    IO-APIC-edge  serial
  4:      70682          0          0          0    IO-APIC-edge  serial
  8:          1          0          0          0    IO-APIC-edge  rtc
  9:          0          0          0          0   IO-APIC-level  acpi
 12:    1244334          0          0          0    IO-APIC-edge  i8042
 14:   21708437        129   18313205       1492    IO-APIC-edge  ide0
 15:   10859481         78    9907876         64    IO-APIC-edge  ide1
 16:        655          0          0          0   IO-APIC-level  uhci_hcd
 19:      34458          0          0          0   IO-APIC-level  uhci_hcd, serial
 20:    5762897          0    1009019          0   IO-APIC-level  eth0
 22:    4985838         60    4908642         42   IO-APIC-level  ide2, ide3
 23:    3667147          0          0          0   IO-APIC-level  eth1
 48:    8459639          0          0          0   IO-APIC-level  EMU10K1
NMI:          0          0          0          0
LOC:  322346360  322346360  322346359  322346358
ERR:          0
MIS:          0



* Re: 2.6.1 and irq balancing
  2004-01-12  4:42 ` Bill Davidsen
  2004-01-12 14:06   ` Zwane Mwaikambo
@ 2004-01-12 16:10   ` Martin J. Bligh
  1 sibling, 0 replies; 18+ messages in thread
From: Martin J. Bligh @ 2004-01-12 16:10 UTC (permalink / raw)
  To: Bill Davidsen, linux-kernel

> How much is significant? The term doesn't really help much. I will say that with one NIC taking 120MB/sec of data to a TB database and copying to two other machine (~220MB)  my interrupts got up in in the 5k-12k range with essentially CPU0 doing the work, some few percent going to CPU2.


1010 per second, IIRC. Try this patch:

diff -aurpN -X /home/fletch/.diff.exclude 290-gfp_node_strict/arch/i386/kernel/io_apic.c 310-irqbal_fast/arch/i386/kernel/io_apic.c
--- 290-gfp_node_strict/arch/i386/kernel/io_apic.c	Fri Jan  9 22:25:48 2004
+++ 310-irqbal_fast/arch/i386/kernel/io_apic.c	Fri Jan  9 22:27:55 2004
@@ -401,7 +401,7 @@ static void do_irq_balance(void)
 	unsigned long max_cpu_irq = 0, min_cpu_irq = (~0);
 	unsigned long move_this_load = 0;
 	int max_loaded = 0, min_loaded = 0;
-	unsigned long useful_load_threshold = balanced_irq_interval + 10;
+	unsigned long useful_load_threshold = balanced_irq_interval / 10;
 	int selected_irq;
 	int tmp_loaded, first_attempt = 1;
 	unsigned long tmp_cpu_irq;

M.
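
For readers who want to see what the one-character change does to the threshold, here is a minimal sketch. It mirrors only the two assignment lines in the diff; the interval values are illustrative assumptions, since the thread does not state the runtime value of `balanced_irq_interval`.

```python
def useful_load_threshold(balanced_irq_interval: int, patched: bool) -> int:
    """Mirror the '-' and '+' lines of the diff for one interval value."""
    if patched:
        # C unsigned integer division truncates, hence '//'
        return balanced_irq_interval // 10
    return balanced_irq_interval + 10


# Interval values below are illustrative assumptions, not kernel constants.
for interval in (500, 1000, 5000):
    before = useful_load_threshold(interval, patched=False)
    after = useful_load_threshold(interval, patched=True)
    print(f"interval={interval:5d}  unpatched={before:5d}  patched={after:4d}")
```

With an assumed interval of 1000, the unpatched expression yields 1010, which happens to match the figure quoted above; the patched expression yields a threshold roughly ten times lower, so balancing would kick in at a much lighter interrupt load.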



* Re: 2.6.1 and irq balancing
  2004-01-11 23:59 2.6.1 and irq balancing Nakajima, Jun
  2004-01-12  4:42 ` Bill Davidsen
@ 2004-01-13  6:50 ` Ethan Weinstein
  2004-01-13  7:05   ` Nick Piggin
  1 sibling, 1 reply; 18+ messages in thread
From: Ethan Weinstein @ 2004-01-13  6:50 UTC (permalink / raw)
  To: Nakajima, Jun; +Cc: Ed Tomlinson, linux-kernel, piggin, Kamble, Nitin A

Nakajima, Jun wrote:

>> Admittedly, the machine's load was not high when I took this sample.
>> However, creating a great deal of load does not change these statistics 
>> at all.  Being that there are patches available for 2.4.x kernels to 
>> fix this, I don't think this at all by design, but what do I know? =)
>>  

> 2.6 kernels don't need a patch to it as far as I understand. Are you
> saying that with significant amount of load, you did not see any
> distribution of interrupts? Today's threshold in the kernel is high
> because we found moving around interrupts frequently rather hurt the
> cache and thus lower the performance compared to "do nothing". Can you
> try to create significant load with your network (eth0 and eh1) and see
> what happens? 
> 
> Jun 

Here's the situation two days later. I created some brief periods of 
high load on eth1, and I see we have some change:


            CPU0       CPU1       CPU2       CPU3
   0:  184932542          0    2592511          0    IO-APIC-edge  timer
   1:       1875          0          0          0    IO-APIC-edge  i8042
   2:          0          0          0          0          XT-PIC  cascade
   3:    3046103          0          0          0    IO-APIC-edge  serial
   8:          2          0          0          0    IO-APIC-edge  rtc
   9:          0          0          0          0   IO-APIC-level  acpi
  14:         76          0          0          0    IO-APIC-edge  ide0
  16:    2978264          0          0          0   IO-APIC-level  sym53c8xx
  22:    7838940          0          0          0   IO-APIC-level  eth0
  48:     916078          0     125150          0   IO-APIC-level  aic79xx
  49:    1099375          0          0          0   IO-APIC-level  aic79xx
  54:   51484241        316   50560879        279   IO-APIC-level  eth1
NMI:          0          0          0          0
LOC:  187530735  187530988  187530981  187530986
ERR:          0
MIS:          0


My argument is below.  This is an old 2x Pentium II @ 400 MHz, also 
running 2.6 (an old Compaq Proliant, to be exact).  This machine obviously 
has no HT, so why the balanced load?


            CPU0       CPU1
   0: 1066522197 1117196193    IO-APIC-edge  timer
   1:         42         19    IO-APIC-edge  i8042
   2:          0          0          XT-PIC  cascade
   5:   23523428   23510845   IO-APIC-level  TLAN
   8:          0          4    IO-APIC-edge  rtc
   9:         15         15   IO-APIC-level  sym53c8xx
  10:    6874323    6809042   IO-APIC-level  sym53c8xx
  11:    7545802    7509034   IO-APIC-level  ida0
  14:          8          2    IO-APIC-edge  ide0
NMI:          0          0
LOC: 2183867261 2183867237
ERR:          0
MIS:          0
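
A quick way to compare tables like the two above is to turn the raw counters into per-CPU shares. The following is a minimal sketch, not a tool used by anyone in this thread; the function names are made up for illustration, and the sample rows are copied from the four-CPU table above.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text into {label: (counts, description)}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpus = len(lines[0].split())            # header row: CPU0 CPU1 ...
    table = {}
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        fields = rest.split()
        counts = fields[:ncpus]
        if len(counts) < ncpus or not all(c.isdigit() for c in counts):
            continue                         # e.g. ERR/MIS rows hold a single total
        table[label.strip()] = ([int(c) for c in counts],
                                " ".join(fields[ncpus:]))
    return table


def cpu_shares(counts):
    """Fraction of an IRQ's interrupts handled by each CPU."""
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


SAMPLE = """\
            CPU0       CPU1       CPU2       CPU3
  22:    7838940          0          0          0   IO-APIC-level  eth0
  54:   51484241        316   50560879        279   IO-APIC-level  eth1
ERR:          0
"""

for irq, (counts, desc) in parse_interrupts(SAMPLE).items():
    shares = ", ".join(f"{s:.1%}" for s in cpu_shares(counts))
    print(f"{irq:>3} {desc:<22} {shares}")
```

On a live system the same function could be fed the contents of /proc/interrupts; taking two snapshots and subtracting the counts would show the distribution over an interval rather than since boot.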



Ethan


* Re: 2.6.1 and irq balancing
  2004-01-13  6:50 ` Ethan Weinstein
@ 2004-01-13  7:05   ` Nick Piggin
  0 siblings, 0 replies; 18+ messages in thread
From: Nick Piggin @ 2004-01-13  7:05 UTC (permalink / raw)
  To: Ethan Weinstein
  Cc: Nakajima, Jun, Ed Tomlinson, linux-kernel, Kamble, Nitin A



Ethan Weinstein wrote:

> Nakajima, Jun wrote:
>
>>> Admittedly, the machine's load was not high when I took this sample.
>>> However, creating a great deal of load does not change these 
>>> statistics at all.  Being that there are patches available for 2.4.x 
>>> kernels to fix this, I don't think this at all by design, but what 
>>> do I know? =)
>>>  
>>
>
>> 2.6 kernels don't need a patch to it as far as I understand. Are you
>> saying that with significant amount of load, you did not see any
>> distribution of interrupts? Today's threshold in the kernel is high
>> because we found moving around interrupts frequently rather hurt the
>> cache and thus lower the performance compared to "do nothing". Can you
>> try to create significant load with your network (eth0 and eh1) and see
>> what happens?
>> Jun 
>
>
> Here's the situation two days later, I created some brief periods of 
> high load on eth1 and I see we have some change:
>
>
>            CPU0       CPU1       CPU2       CPU3
>   0:  184932542          0    2592511          0    IO-APIC-edge  timer
>   1:       1875          0          0          0    IO-APIC-edge  i8042
>   2:          0          0          0          0          XT-PIC  cascade
>   3:    3046103          0          0          0    IO-APIC-edge  serial
>   8:          2          0          0          0    IO-APIC-edge  rtc
>   9:          0          0          0          0   IO-APIC-level  acpi
>  14:         76          0          0          0    IO-APIC-edge  ide0
>  16:    2978264          0          0          0   IO-APIC-level  sym53c8xx
>  22:    7838940          0          0          0   IO-APIC-level  eth0
>  48:     916078          0     125150          0   IO-APIC-level  aic79xx
>  49:    1099375          0          0          0   IO-APIC-level  aic79xx
>  54:   51484241        316   50560879        279   IO-APIC-level  eth1
> NMI:          0          0          0          0
> LOC:  187530735  187530988  187530981  187530986
> ERR:          0
> MIS:          0
>


Aside from the obvious imbalance between physical CPUs:
I think interrupts should be much more freely balanced between siblings
that share cache, otherwise process a running on CPU0 gets less time than
process b running on CPU1 because of the interrupt load.


>
> My argument is (see below).  This is an old 2x pentium2 @400, also 
> running 2.6, an old Compaq Proliant to be exact.  This machine 
> obviously has no HT, so why the balanced load?


IIRC the P2/3 APICs are set to a round robin delivery mode while the P4
ones are not. It is still not ideal, though: while you have fairness, you
now have suboptimal performance.




* Re: 2.6.1 and irq balancing
  2004-01-11 16:50   ` Joe Korty
  2004-01-11 18:19     ` Arjan van de Ven
@ 2004-01-15 11:43     ` Pavel Machek
  1 sibling, 0 replies; 18+ messages in thread
From: Pavel Machek @ 2004-01-15 11:43 UTC (permalink / raw)
  To: Joe Korty
  Cc: Arjan van de Ven, Ethan Weinstein, linux-kernel, William Lee Irwin III

Hi!

> > > Greetings all,
> > > 
> > > I upgraded my server to 2.6.1, and I'm finding I'm saddled with only 
> > > interrupting on CPU0 again. 2.6.0 does this as well. This is the 
> > > Supermicro X5DPL-iGM-O (E7501 chipset), 2 Xeons@2.4ghz HT enabled. 
> > > /proc/cpuinfo is normal as per HT, displaying 4 cpus.
> > 
> > you should run the userspace irq balance daemon:
> > http://people.redhat.com/arjanv/irqbalance/
> 
> I have long wondered what is so evil about most interrupts going to
> CPU 0 that we felt we had to have a pair of irqdaemons in 2.6.  From my
> (admittedly imperfect) experience, the APIC will route an interrupt to
> CPU 1 if CPU 0 is busy with another interrupt, to CPU 2 if 0 and 1 are
> so occupied, and so on.  I see no harm in this other than the strangely
> lopsided /proc/interrupt displays, which I can live with.

Well, imagine an 8-CPU machine with high interrupt load. A poor process
that gets scheduled on CPU#0 makes little progress, but is shown as
eating one whole CPU.
								Pavel
-- 
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]


* RE: 2.6.1 and irq balancing
@ 2004-01-13  8:09 Nakajima, Jun
  0 siblings, 0 replies; 18+ messages in thread
From: Nakajima, Jun @ 2004-01-13  8:09 UTC (permalink / raw)
  To: Nick Piggin, Ethan Weinstein; +Cc: Ed Tomlinson, linux-kernel, Kamble, Nitin A

> Aside from the obvious imbalance between physical CPUs:
> I think interrupts should be much more freely balanced between siblings
> that share cache, otherwise process a running on CPU0 gets less time than
> process b running on CPU1 because of the interrupt load.
>
That scheduling issue is true. Today we balance interrupt load on a
package (i.e. physical CPU) basis, and we don't care which logical
processors do the interrupt handling because it should not matter in
terms of performance. 

Jun




* RE: 2.6.1 and irq balancing
@ 2004-01-13  7:57 Nakajima, Jun
  0 siblings, 0 replies; 18+ messages in thread
From: Nakajima, Jun @ 2004-01-13  7:57 UTC (permalink / raw)
  To: Ethan Weinstein; +Cc: Ed Tomlinson, linux-kernel, piggin, Kamble, Nitin A

I don't see a major problem there. If you look at eth1, it has the
biggest load by an order of magnitude. Interrupts from disk controllers
cannot be a major load over a given time period unless you have a lot
of disks (tens, hundreds).

>   16:    2978264          0          0          0   IO-APIC-level  sym53c8xx
>   22:    7838940          0          0          0   IO-APIC-level  eth0
>   48:     916078          0     125150          0   IO-APIC-level  aic79xx
>   49:    1099375          0          0          0   IO-APIC-level  aic79xx
>   54:   51484241        316   50560879        279   IO-APIC-level  eth1

The timer is a bit tricky because it continuously generates 1000
interrupts per second, so its cumulative count looks high. But over any
given time period, the timer interrupt load should be negligible.

>    0:  184932542          0    2592511          0    IO-APIC-edge  timer
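
As a rough cross-check, the cumulative counters can be turned into average rates. The sketch below assumes the timer has ticked at a constant 1000 per second since boot (an assumption taken from the remark above, not something measured) and uses the timer and eth1 rows quoted in this thread.

```python
HZ = 1000  # assumed timer rate, per the "1000 per sec" remark

timer_counts = [184932542, 0, 2592511, 0]     # IRQ 0 row, CPU0..CPU3
eth1_counts = [51484241, 316, 50560879, 279]  # IRQ 54 row, CPU0..CPU3


def average_rate(counts, uptime_seconds):
    """Average interrupts per second across all CPUs since boot."""
    return sum(counts) / uptime_seconds


# Seconds since boot, valid only under the constant-HZ assumption.
uptime = sum(timer_counts) / HZ

print(f"estimated uptime: {uptime / 3600:.1f} hours")
print(f"eth1 average:     {average_rate(eth1_counts, uptime):.0f} interrupts/s")
```

An average since boot hides bursts, so this says nothing about the peak rate during the brief high-load periods; it only puts the large cumulative numbers in perspective.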


>             CPU0       CPU1
>    0: 1066522197 1117196193    IO-APIC-edge  timer
>    1:         42         19    IO-APIC-edge  i8042
>    2:          0          0          XT-PIC  cascade
>    5:   23523428   23510845   IO-APIC-level  TLAN
>    8:          0          4    IO-APIC-edge  rtc
>    9:         15         15   IO-APIC-level  sym53c8xx
>   10:    6874323    6809042   IO-APIC-level  sym53c8xx
>   11:    7545802    7509034   IO-APIC-level  ida0
>   14:          8          2    IO-APIC-edge  ide0
> NMI:          0          0
> LOC: 2183867261 2183867237
> ERR:          0
> MIS:          0

The above is generated by the round-robin interrupt distribution
provided by the chipset. It's very easy to do this on P4P systems
(that's basically Ingo's initial patch), but the performance tends
to be worse if you do that. In fact, we saw better results on PIII
systems when we used the balancer in the kernel, i.e. irqbalance, rather
than the chipset's round-robin distribution.

Jun



* Re: 2.6.1 and irq balancing
  2004-01-11 16:50   ` Joe Korty
@ 2004-01-11 18:19     ` Arjan van de Ven
  2004-01-15 11:43     ` Pavel Machek
  1 sibling, 0 replies; 18+ messages in thread
From: Arjan van de Ven @ 2004-01-11 18:19 UTC (permalink / raw)
  To: Joe Korty; +Cc: Ethan Weinstein, linux-kernel, William Lee Irwin III


On Sun, Jan 11, 2004 at 11:50:12AM -0500, Joe Korty wrote:
> On Sun, Jan 11, 2004 at 10:51:22AM +0100, Arjan van de Ven wrote:
> > On Sun, 2004-01-11 at 00:14, Ethan Weinstein wrote:
> > > Greetings all,
> > > 
> > > I upgraded my server to 2.6.1, and I'm finding I'm saddled with only 
> > > interrupting on CPU0 again. 2.6.0 does this as well. This is the 
> > > Supermicro X5DPL-iGM-O (E7501 chipset), 2 Xeons@2.4ghz HT enabled. 
> > > /proc/cpuinfo is normal as per HT, displaying 4 cpus.
> > 
> > you should run the userspace irq balance daemon:
> > http://people.redhat.com/arjanv/irqbalance/
> 
> I have long wondered what is so evil about most interrupts going to
> CPU 0 that we felt we had to have a pair of irqdaemons in 2.6.

well irqbalanced is a userspace balancer

> Earlier APICs had a variation where the search for where each new
> interrupt was to go started with first cpu after the one that got the
> last interrupt.  If we call this 'round-robin' allocation, then today's
> technique could be described as 'first fit'.

if it's really busy it starves cpu0 .... 



* Re: 2.6.1 and irq balancing
  2004-01-11  9:51 ` Arjan van de Ven
@ 2004-01-11 16:50   ` Joe Korty
  2004-01-11 18:19     ` Arjan van de Ven
  2004-01-15 11:43     ` Pavel Machek
  0 siblings, 2 replies; 18+ messages in thread
From: Joe Korty @ 2004-01-11 16:50 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: Ethan Weinstein, linux-kernel, William Lee Irwin III

On Sun, Jan 11, 2004 at 10:51:22AM +0100, Arjan van de Ven wrote:
> On Sun, 2004-01-11 at 00:14, Ethan Weinstein wrote:
> > Greetings all,
> > 
> > I upgraded my server to 2.6.1, and I'm finding I'm saddled with only 
> > interrupting on CPU0 again. 2.6.0 does this as well. This is the 
> > Supermicro X5DPL-iGM-O (E7501 chipset), 2 Xeons@2.4ghz HT enabled. 
> > /proc/cpuinfo is normal as per HT, displaying 4 cpus.
> 
> you should run the userspace irq balance daemon:
> http://people.redhat.com/arjanv/irqbalance/

I have long wondered what is so evil about most interrupts going to
CPU 0 that we felt we had to have a pair of irqdaemons in 2.6.  From my
(admittedly imperfect) experience, the APIC will route an interrupt to
CPU 1 if CPU 0 is busy with another interrupt, to CPU 2 if 0 and 1 are
so occupied, and so on.  I see no harm in this other than the strangely
lopsided /proc/interrupt displays, which I can live with.

Earlier APICs had a variation where the search for where each new
interrupt was to go started with first cpu after the one that got the
last interrupt.  If we call this 'round-robin' allocation, then today's
technique could be described as 'first fit'.

If I am wrong about this, I would dearly love to be corrected :)
Joe
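
The two allocation schemes described above can be compared with a toy simulation. This is only a sketch of the description in this message, not a model of real APIC arbitration; the function name, the tick-based timing, and the fallback when every CPU is busy are all assumptions made for illustration.

```python
def simulate(policy, ncpus, arrivals, service_time):
    """Deliver interrupts arriving at the given times; return per-CPU counts.

    policy: "first_fit" starts each search at CPU 0;
            "round_robin" starts at the CPU after the last recipient.
    A CPU is skipped while it is still handling an earlier interrupt.
    """
    busy_until = [0] * ncpus
    counts = [0] * ncpus
    last = -1
    for t in arrivals:
        start = 0 if policy == "first_fit" else (last + 1) % ncpus
        for k in range(ncpus):
            cpu = (start + k) % ncpus
            if busy_until[cpu] <= t:
                break
        else:
            # Every CPU is busy: queue on the first candidate (simplification).
            cpu = start
        busy_until[cpu] = max(busy_until[cpu], t) + service_time
        counts[cpu] += 1
        last = cpu
    return counts


# One interrupt per tick for 100 ticks on 4 CPUs.
for service_time in (1, 2):
    for policy in ("first_fit", "round_robin"):
        result = simulate(policy, 4, range(100), service_time)
        print(f"service_time={service_time} {policy:<11} {result}")
```

In this toy model, first fit sends everything to CPU 0 as long as each handler finishes before the next interrupt arrives, and spills to CPU 1 only when handlers overlap, which resembles the lopsided /proc/interrupts displays discussed in this thread. Round robin spreads the counts evenly regardless of load.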


* Re: 2.6.1 and irq balancing
  2004-01-10 23:14 Ethan Weinstein
  2004-01-11  2:39 ` Ed Tomlinson
  2004-01-11  9:51 ` Arjan van de Ven
@ 2004-01-11 13:14 ` Martin Schlemmer
  2 siblings, 0 replies; 18+ messages in thread
From: Martin Schlemmer @ 2004-01-11 13:14 UTC (permalink / raw)
  To: Ethan Weinstein; +Cc: Linux Kernel Mailing Lists, William Lee Irwin III


On Sun, 2004-01-11 at 01:14, Ethan Weinstein wrote:
> Greetings all,
> 
> I upgraded my server to 2.6.1, and I'm finding I'm saddled with only 
> interrupting on CPU0 again. 2.6.0 does this as well. This is the 
> Supermicro X5DPL-iGM-O (E7501 chipset), 2 Xeons@2.4ghz HT enabled. 
> /proc/cpuinfo is normal as per HT, displaying 4 cpus.
> 2.4.2(3|4) exhibited this behaviour as well, until I applied patches 
> from here: 
> http://www.hardrock.org/kernel/2.4.23/irqbalance-2.4.23-jb.patch, et al.
> 
> 
>             CPU0       CPU1       CPU2       CPU3
>    0:    1572323          0          0          0    IO-APIC-edge  timer
>    2:          0          0          0          0          XT-PIC  cascade
>    3:      23520          0          0          0    IO-APIC-edge  serial
>    8:          2          0          0          0    IO-APIC-edge  rtc
>    9:          0          0          0          0   IO-APIC-level  acpi
>   14:         10          0          0          0    IO-APIC-edge  ide0
>   16:         30          0          0          0   IO-APIC-level  sym53c8xx
>   22:       4162          0          0          0   IO-APIC-level  eth0
>   48:       7798          0          0          0   IO-APIC-level  aic79xx
>   49:       3385          0          0          0   IO-APIC-level  aic79xx
>   54:      17062          0          0          0   IO-APIC-level  eth1
> NMI:          0          0          0          0
> LOC:    1572002    1572251    1572250    1572243
> ERR:          0
> MIS:          0
> 
> 
> The keyboard isn't working either, but we see the i8042..
> 
> serio: i8042 KBD port at 0x60,0x64 irq 1
> 

Are you running irqbalance ?


-- 
Martin Schlemmer



* Re: 2.6.1 and irq balancing
  2004-01-11  3:38   ` Nick Piggin
@ 2004-01-11  9:52     ` Arjan van de Ven
  0 siblings, 0 replies; 18+ messages in thread
From: Arjan van de Ven @ 2004-01-11  9:52 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Ed Tomlinson, linux-kernel, Ethan Weinstein


On Sun, 2004-01-11 at 04:38, Nick Piggin wrote:
> Ed Tomlinson wrote:
> 
> >Hi,
> >
> >What is the load on the box when this is happening?  If its low think
> >this is optimal (for cache reasons).
> >  
> >
> 
> I'd rather see different interrupt sources run on different CPUs
> initially, which would help fairness a little bit, and should be
> more optimal with big interrupt loads.
> 
> 
> 0:      xxx1     0      0      0
> 1:      0     xxx2      0      0
> 2:      0        0   xxx3      0
> 3:      0        0      0   xxx4
> 
> This would delay the need for interrupt balancing in the case where
> 2 or more interrupts are heavily used.

this is what irqbalanced will do (but more intelligently than just using
the irq number as round robin seed).




* Re: 2.6.1 and irq balancing
  2004-01-10 23:14 Ethan Weinstein
  2004-01-11  2:39 ` Ed Tomlinson
@ 2004-01-11  9:51 ` Arjan van de Ven
  2004-01-11 16:50   ` Joe Korty
  2004-01-11 13:14 ` Martin Schlemmer
  2 siblings, 1 reply; 18+ messages in thread
From: Arjan van de Ven @ 2004-01-11  9:51 UTC (permalink / raw)
  To: Ethan Weinstein; +Cc: linux-kernel, William Lee Irwin III


On Sun, 2004-01-11 at 00:14, Ethan Weinstein wrote:
> Greetings all,
> 
> I upgraded my server to 2.6.1, and I'm finding I'm saddled with only 
> interrupting on CPU0 again. 2.6.0 does this as well. This is the 
> Supermicro X5DPL-iGM-O (E7501 chipset), 2 Xeons@2.4ghz HT enabled. 
> /proc/cpuinfo is normal as per HT, displaying 4 cpus.

you should run the userspace irq balance daemon:
http://people.redhat.com/arjanv/irqbalance/

(or part of the kernel-utils package if you run a RH based distro; afaik
SuSE ships it too but I don't know in what package)


* Re: 2.6.1 and irq balancing
  2004-01-11  2:39 ` Ed Tomlinson
  2004-01-11  3:38   ` Nick Piggin
@ 2004-01-11  5:19   ` Ethan Weinstein
  1 sibling, 0 replies; 18+ messages in thread
From: Ethan Weinstein @ 2004-01-11  5:19 UTC (permalink / raw)
  To: Ed Tomlinson; +Cc: linux-kernel, piggin

Ed Tomlinson wrote:
> Hi,
> 
> What is the load on the box when this is happening?  If it's low, I think
> this is optimal (for cache reasons).
> 

Admittedly, the machine's load was not high when I took this sample. 
However, creating a great deal of load does not change these statistics 
at all.  Being that there are patches available for 2.4.x kernels to fix 
this, I don't think this is at all by design, but what do I know? =)

2.6.0 running on a non-HT SMP machine I have (old Compaq proliant 
2xPentium2) does interrupt on all CPU's with "noirqbalance" bootparam.

Regarding the keyboard, I noticed something interesting

2.6.1-rc1 shows the i8042 in /proc/interrupts:

   1:       1871          0          0          0    IO-APIC-edge  i8042

(keyboard still does not work, though..)

2.6.1 final does not show this at all, and [kseriod] eats a constant 5% 
  CPU.  Something's awry =)


-Ethan


* Re: 2.6.1 and irq balancing
  2004-01-11  2:39 ` Ed Tomlinson
@ 2004-01-11  3:38   ` Nick Piggin
  2004-01-11  9:52     ` Arjan van de Ven
  2004-01-11  5:19   ` Ethan Weinstein
  1 sibling, 1 reply; 18+ messages in thread
From: Nick Piggin @ 2004-01-11  3:38 UTC (permalink / raw)
  To: Ed Tomlinson; +Cc: linux-kernel, Ethan Weinstein



Ed Tomlinson wrote:

>Hi,
>
>What is the load on the box when this is happening?  If it's low, I think
>this is optimal (for cache reasons).
>  
>

I'd rather see different interrupt sources run on different CPUs
initially, which would help fairness a little bit, and should be
more optimal with big interrupt loads.


0:      xxx1     0      0      0
1:      0     xxx2      0      0
2:      0        0   xxx3      0
3:      0        0      0   xxx4

This would delay the need for interrupt balancing in the case where
2 or more interrupts are heavily used.
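
The diagonal layout above can be illustrated with a minimal sketch. It assumes a naive scheme in which each IRQ is assigned, in sorted order, to the next CPU in turn. The function name `round_robin_affinity` is made up for this example, and the IRQ numbers are taken from the /proc/interrupts output in the original report. The masks are one-CPU hex bitmasks of the kind written to /proc/irq/<N>/smp_affinity; the sketch only computes them and does not touch the system.

```python
def round_robin_affinity(irqs, ncpus):
    """Map each IRQ to a one-CPU hex bitmask, cycling through the CPUs.

    Hypothetical helper for illustration only: the position of the IRQ in
    the sorted list (not the IRQ number itself) picks the CPU.
    """
    masks = {}
    for position, irq in enumerate(sorted(irqs)):
        cpu = position % ncpus
        # Bit N set means "deliver to CPU N"; formatted as hex without "0x".
        masks[irq] = format(1 << cpu, "x")
    return masks


if __name__ == "__main__":
    # IRQ numbers of the busy devices from the report (sym53c8xx, eth0,
    # aic79xx x2, eth1) spread over the four logical CPUs.
    for irq, mask in round_robin_affinity([16, 22, 48, 49, 54], 4).items():
        print(f"IRQ {irq}: smp_affinity mask {mask}")
```

With more IRQs than CPUs the assignment simply wraps around, so a static mapping like this takes no account of how busy each source actually is.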




* Re: 2.6.1 and irq balancing
  2004-01-10 23:14 Ethan Weinstein
@ 2004-01-11  2:39 ` Ed Tomlinson
  2004-01-11  3:38   ` Nick Piggin
  2004-01-11  5:19   ` Ethan Weinstein
  2004-01-11  9:51 ` Arjan van de Ven
  2004-01-11 13:14 ` Martin Schlemmer
  2 siblings, 2 replies; 18+ messages in thread
From: Ed Tomlinson @ 2004-01-11  2:39 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ethan Weinstein

Hi,

What is the load on the box when this is happening?  If it's low, I think
this is optimal (for cache reasons).

Ed Tomlinson

On January 10, 2004 06:14 pm, Ethan Weinstein wrote:
> Greetings all,
> 
> I upgraded my server to 2.6.1, and I'm finding I'm saddled with only 
> interrupting on CPU0 again. 2.6.0 does this as well. This is the 
> Supermicro X5DPL-iGM-O (E7501 chipset), 2 Xeons@2.4ghz HT enabled. 
> /proc/cpuinfo is normal as per HT, displaying 4 cpus.
> 2.4.2(3|4) exhibited this behaviour as well, until I applied patches 
> from here: 
> http://www.hardrock.org/kernel/2.4.23/irqbalance-2.4.23-jb.patch, et al.
> 
> 
>             CPU0       CPU1       CPU2       CPU3
>    0:    1572323          0          0          0    IO-APIC-edge  timer
>    2:          0          0          0          0          XT-PIC  cascade
>    3:      23520          0          0          0    IO-APIC-edge  serial
>    8:          2          0          0          0    IO-APIC-edge  rtc
>    9:          0          0          0          0   IO-APIC-level  acpi
>   14:         10          0          0          0    IO-APIC-edge  ide0
>   16:         30          0          0          0   IO-APIC-level  sym53c8xx
>   22:       4162          0          0          0   IO-APIC-level  eth0
>   48:       7798          0          0          0   IO-APIC-level  aic79xx
>   49:       3385          0          0          0   IO-APIC-level  aic79xx
>   54:      17062          0          0          0   IO-APIC-level  eth1
> NMI:          0          0          0          0
> LOC:    1572002    1572251    1572250    1572243
> ERR:          0
> MIS:          0
> 
> 
> The keyboard isn't working either, but we see the i8042..
> 
> serio: i8042 KBD port at 0x60,0x64 irq 1
> 
> 
> -Ethan
> 


* 2.6.1 and irq balancing
@ 2004-01-10 23:14 Ethan Weinstein
  2004-01-11  2:39 ` Ed Tomlinson
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Ethan Weinstein @ 2004-01-10 23:14 UTC (permalink / raw)
  To: linux-kernel; +Cc: William Lee Irwin III

Greetings all,

I upgraded my server to 2.6.1, and I'm finding I'm saddled with only 
interrupting on CPU0 again. 2.6.0 does this as well. This is the 
Supermicro X5DPL-iGM-O (E7501 chipset), 2 Xeons@2.4ghz HT enabled. 
/proc/cpuinfo is normal as per HT, displaying 4 cpus.
2.4.2(3|4) exhibited this behaviour as well, until I applied patches 
from here: 
http://www.hardrock.org/kernel/2.4.23/irqbalance-2.4.23-jb.patch, et al.


            CPU0       CPU1       CPU2       CPU3
   0:    1572323          0          0          0    IO-APIC-edge  timer
   2:          0          0          0          0          XT-PIC  cascade
   3:      23520          0          0          0    IO-APIC-edge  serial
   8:          2          0          0          0    IO-APIC-edge  rtc
   9:          0          0          0          0   IO-APIC-level  acpi
  14:         10          0          0          0    IO-APIC-edge  ide0
  16:         30          0          0          0   IO-APIC-level  sym53c8xx
  22:       4162          0          0          0   IO-APIC-level  eth0
  48:       7798          0          0          0   IO-APIC-level  aic79xx
  49:       3385          0          0          0   IO-APIC-level  aic79xx
  54:      17062          0          0          0   IO-APIC-level  eth1
NMI:          0          0          0          0
LOC:    1572002    1572251    1572250    1572243
ERR:          0
MIS:          0
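
As a rough illustration of how a table like the one above can be checked, the sketch below sums the per-CPU counts over the numbered IRQ lines. The sample text is copied from the output above. The function name `per_cpu_totals` is made up for this example, and the parsing assumes the column layout shown here (a header row of CPU names, then `label: count count ...`).

```python
SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3
  0:    1572323          0          0          0    IO-APIC-edge  timer
  2:          0          0          0          0          XT-PIC  cascade
  3:      23520          0          0          0    IO-APIC-edge  serial
  8:          2          0          0          0    IO-APIC-edge  rtc
  9:          0          0          0          0   IO-APIC-level  acpi
 14:         10          0          0          0    IO-APIC-edge  ide0
 16:         30          0          0          0   IO-APIC-level  sym53c8xx
 22:       4162          0          0          0   IO-APIC-level  eth0
 48:       7798          0          0          0   IO-APIC-level  aic79xx
 49:       3385          0          0          0   IO-APIC-level  aic79xx
 54:      17062          0          0          0   IO-APIC-level  eth1
NMI:          0          0          0          0
LOC:    1572002    1572251    1572250    1572243
ERR:          0
MIS:          0
"""


def per_cpu_totals(text):
    """Sum interrupt counts per CPU over the numbered IRQ lines only."""
    lines = text.splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = [0] * ncpus
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        if not label.strip().isdigit():
            continue  # skip NMI, LOC, ERR, MIS and blank lines
        for cpu, count in enumerate(rest.split()[:ncpus]):
            totals[cpu] += int(count)
    return totals


if __name__ == "__main__":
    print(per_cpu_totals(SAMPLE))
```

Any nonzero entry beyond the first would mean at least one device interrupt was delivered to a CPU other than CPU0.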


The keyboard isn't working either, but we see the i8042..

serio: i8042 KBD port at 0x60,0x64 irq 1


-Ethan


end of thread, other threads:[~2004-01-15 11:44 UTC | newest]

Thread overview: 18+ messages
2004-01-11 23:59 2.6.1 and irq balancing Nakajima, Jun
2004-01-12  4:42 ` Bill Davidsen
2004-01-12 14:06   ` Zwane Mwaikambo
2004-01-12 16:10   ` Martin J. Bligh
2004-01-13  6:50 ` Ethan Weinstein
2004-01-13  7:05   ` Nick Piggin
  -- strict thread matches above, loose matches on Subject: below --
2004-01-13  8:09 Nakajima, Jun
2004-01-13  7:57 Nakajima, Jun
2004-01-10 23:14 Ethan Weinstein
2004-01-11  2:39 ` Ed Tomlinson
2004-01-11  3:38   ` Nick Piggin
2004-01-11  9:52     ` Arjan van de Ven
2004-01-11  5:19   ` Ethan Weinstein
2004-01-11  9:51 ` Arjan van de Ven
2004-01-11 16:50   ` Joe Korty
2004-01-11 18:19     ` Arjan van de Ven
2004-01-15 11:43     ` Pavel Machek
2004-01-11 13:14 ` Martin Schlemmer
