linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH] i386: Fix softirq accounting with 4K stacks
@ 2006-06-26 22:54 Chuck Ebbert
  2006-06-27  0:57 ` Björn Steinbrink
  0 siblings, 1 reply; 11+ messages in thread
From: Chuck Ebbert @ 2006-06-26 22:54 UTC (permalink / raw)
  To: Bjorn Steinbrink; +Cc: linux-kernel

In-Reply-To: <20060625184244.GA11921@atjola.homenet>

On Sun, 25 Jun 2006 20:42:44 +0200, Bjorn Steinbrink wrote:

> Btw, which path do apic irqs go? I stumbled across the nmi stuff, but
> didn't see anything special for the apic irqs.

arch/i386/kernel/entry.S has the macro BUILD_INTERRUPT(name, nr).

The code that uses this macro is in arch/i386/mach-*/entry_arch.h.

The macro prepends "smp_" to the name passed to the macro and the
generated asm code calls that function after saving registers, etc.

arch/i386/kernel/apic.c::apic_intr_init() calls set_intr_gate() to
point some of the interrupt gates at the correct functions.

-- 
Chuck
 "You can't read a newspaper if you can't read."  --George W. Bush

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] i386: Fix softirq accounting with 4K stacks
  2006-06-26 22:54 [PATCH] i386: Fix softirq accounting with 4K stacks Chuck Ebbert
@ 2006-06-27  0:57 ` Björn Steinbrink
  0 siblings, 0 replies; 11+ messages in thread
From: Björn Steinbrink @ 2006-06-27  0:57 UTC (permalink / raw)
  To: Chuck Ebbert; +Cc: linux-kernel

On 2006.06.26 18:54:00 -0400, Chuck Ebbert wrote:
> In-Reply-To: <20060625184244.GA11921@atjola.homenet>
> 
> On Sun, 25 Jun 2006 20:42:44 +0200, Bjorn Steinbrink wrote:
> 
> > Btw, which path do apic irqs go? I stumbled across the nmi stuff, but
> > didn't see anything special for the apic irqs.
> 
> arch/i386/kernel/entry.S has the macro BUILD_INTERRUPT(name, nr).
> 
> The code that uses this macro is in arch/i386/mach-*/entry_arch.h.
> 
> The macro prepends "smp_" to the name passed to the macro and the
> generated asm code calls that function after saving registers, etc.
> 
> arch/i386/kernel/apic.c::apic_intr_init() calls set_intr_gate() to
> point some of the interrupt gates at the correct functions.

Thanks, my view on the code was too cscope-centric and the grep runs
searched for the full function name.
That finally explains why calling update_process_times() in
smp_local_timer_interrupt() makes a difference on UP.

Björn

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] i386: Fix softirq accounting with 4K stacks
  2006-06-26 17:58                               ` Björn Steinbrink
@ 2006-06-27 10:09                                 ` Mike Galbraith
  0 siblings, 0 replies; 11+ messages in thread
From: Mike Galbraith @ 2006-06-27 10:09 UTC (permalink / raw)
  To: Björn Steinbrink
  Cc: Arjan van de Ven, linux-kernel, danial_thom, Linus Torvalds,
	Andrew Morton

On Mon, 2006-06-26 at 19:58 +0200, Björn Steinbrink wrote:

> The results for the forcedeth driven NIC do not agree though, and your
> results differ from what I see as well, right? So I'm kinda lost again.

I'm not.  Not any more.  After much fruitless rummaging, I did some more
profiling to verify the numbers.  It really does take that much more cpu
when booted with noapic, nearly 50%!  I've been chasing swamp gas.

Oh well.  I got a warm fuzzy wrt tools cpu usage numbers out of it.

Stick a fork in this bug, it's done ;-)

	-Mike


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] i386: Fix softirq accounting with 4K stacks
  2006-06-26  3:05                             ` Mike Galbraith
@ 2006-06-26 17:58                               ` Björn Steinbrink
  2006-06-27 10:09                                 ` Mike Galbraith
  0 siblings, 1 reply; 11+ messages in thread
From: Björn Steinbrink @ 2006-06-26 17:58 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Arjan van de Ven, linux-kernel, danial_thom, Linus Torvalds,
	Andrew Morton

On 2006.06.26 05:05:14 +0200, Mike Galbraith wrote:
> On Mon, 2006-06-26 at 04:23 +0200, Mike Galbraith wrote:
> 
> > Something is certainly still b0rken.  I still get three different
> > answers to the question "what is my cpu usage" depending on
> > configuration.  With stock UP kernel with no IO-APIC, interrupt load is
> > all hi.  With your patch and IO-APIC, it's all si.  SMP shows a mix of
> > both.
> 
> I didn't say that quite right.
> 
> The third case for my box is XT-PIC and your patch.  Previously, top
> showed the 10% interrupt load of a flood ping as all hi.  Now it's both
> hi and si when using XT-PIC, but it's no longer the 10% that agreed with
> the profile, it's  10% hi, but with an added ~7% si.

I see the following here, when a box is being ping flooded:

UP/SMP  stack-size  PIC-type  hi si
-----------------------------------
UP      8K          IO-APIC  0  16
UP      8K          XT-PIC   7  7
UP      4K          IO-APIC  0  16
UP      4K          XT-PIC   7  7

SMP     8K          IO-APIC  5  4 (forcedeth)
SMP     8K          IO-APIC  0  11 (tg3)

The UP system is a Thinkpad which only has a tg3 driven NIC. The SMP
system is x86_64, so there's no 4K stacks to test with, maybe I'll fetch
a i386 live CD to do some more valid tests on SMP.

The UP tests seem to show the IO-APIC hardirq are completely deferred
to be handled in a softirq, as the sum of "hi" and "si" with XT-PIC is
about equal to the "si" value with IO-APIC (the numbers are guessed
averages of the observed values, so the 2% difference is not too be
taken too serious).
The results for the forcedeth driven NIC do not agree though, and your
results differ from what I see as well, right? So I'm kinda lost again.

Björn

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] i386: Fix softirq accounting with 4K stacks
  2006-06-26  2:23                           ` Mike Galbraith
@ 2006-06-26  3:05                             ` Mike Galbraith
  2006-06-26 17:58                               ` Björn Steinbrink
  0 siblings, 1 reply; 11+ messages in thread
From: Mike Galbraith @ 2006-06-26  3:05 UTC (permalink / raw)
  To: Björn Steinbrink
  Cc: Arjan van de Ven, linux-kernel, danial_thom, Linus Torvalds,
	Andrew Morton

On Mon, 2006-06-26 at 04:23 +0200, Mike Galbraith wrote:

> Something is certainly still b0rken.  I still get three different
> answers to the question "what is my cpu usage" depending on
> configuration.  With stock UP kernel with no IO-APIC, interrupt load is
> all hi.  With your patch and IO-APIC, it's all si.  SMP shows a mix of
> both.

I didn't say that quite right.

The third case for my box is XT-PIC and your patch.  Previously, top
showed the 10% interrupt load of a flood ping as all hi.  Now it's both
hi and si when using XT-PIC, but it's no longer the 10% that agreed with
the profile, it's  10% hi, but with an added ~7% si.

	-Mike


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] i386: Fix softirq accounting with 4K stacks
  2006-06-25 18:42                         ` Björn Steinbrink
@ 2006-06-26  2:23                           ` Mike Galbraith
  2006-06-26  3:05                             ` Mike Galbraith
  0 siblings, 1 reply; 11+ messages in thread
From: Mike Galbraith @ 2006-06-26  2:23 UTC (permalink / raw)
  To: Björn Steinbrink
  Cc: Arjan van de Ven, linux-kernel, danial_thom, Linus Torvalds,
	Andrew Morton

On Sun, 2006-06-25 at 20:42 +0200, Björn Steinbrink wrote:
> I just booted with both patches applied, mine and Mike's, and that
> actually makes a difference in hardirq cpu time accounting. With my
> patch only, hi is 0 in top while the box gets a ping flood. With both
> patches, I get about 1% hi. Mike's patch causes update_process_times()
> to be called twice on UP, but that alone shouldn't change the
> percentages, right?

Yes, you definitely need to comment out the other call if you test the
SMP path on UP+IO-APIC.

> OTOH top shows "hi" as zero with 8K stacks as well unless Mike's patch
> is applied, so the results with Mike's patch are bogus (if so, why?) or
> hardirq accounting is broken in general.

Something is certainly still b0rken.  I still get three different
answers to the question "what is my cpu usage" depending on
configuration.  With stock UP kernel with no IO-APIC, interrupt load is
all hi.  With your patch and IO-APIC, it's all si.  SMP shows a mix of
both.

I like the result of using the SMP path if you have an IO-APIC best,
though I haven't verified them against a profile for accuracy.  Taking a
peek at the profile confirms that it is indeed mixed, so anything
showing the load as being either hi or si has to be wrong.

	-Mike


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] i386: Fix softirq accounting with 4K stacks
  2006-06-25 17:43                       ` Arjan van de Ven
@ 2006-06-25 18:42                         ` Björn Steinbrink
  2006-06-26  2:23                           ` Mike Galbraith
  0 siblings, 1 reply; 11+ messages in thread
From: Björn Steinbrink @ 2006-06-25 18:42 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Mike Galbraith, linux-kernel, danial_thom, Linus Torvalds, Andrew Morton

On 2006.06.25 19:43:17 +0200, Arjan van de Ven wrote:
> On Sun, 2006-06-25 at 19:44 +0200, Mike Galbraith wrote:
> > On Sun, 2006-06-25 at 16:24 +0200, Björn Steinbrink wrote:
> > 
> > > Still no idea why your "fix" works, but the following patch also fixes
> > > the problem and is at least a little more like the RightThing.
> > 
> > Yeah.  I don't know about you, but I fully intend to blatantly ignore
> > that 'why' ;-)
> 
> the why is relatively easy ;)
> 
> since the "is a softirq executing" bit is on the stack, and each context
> (user, soft and hard irq) has their own stack, it's not automatic that
> the hardirq stack gets the softirq-executing flag... which your patch
> fixes.

That's mine, not Mike's. Mike's patch removed the #ifdef CONFIG_SMP
around update_process_times() in smp_local_timer_interrupt().

> NMI's and apic irqs generally don't go via the normal irq path and thus
> don't do a stack switch... so they don't lose the bit (for accounting
> purposes)

Hm, doesn't that mean that mean that hardirq accounting is still broken?
APIC irq comes in, increases hardirq count, then the timer irq fires,
switches the stack and looses the hardirq count that was incremented by
the APIC irq.

I just booted with both patches applied, mine and Mike's, and that
actually makes a difference in hardirq cpu time accounting. With my
patch only, hi is 0 in top while the box gets a ping flood. With both
patches, I get about 1% hi. Mike's patch causes update_process_times()
to be called twice on UP, but that alone shouldn't change the
percentages, right?
OTOH top shows "hi" as zero with 8K stacks as well unless Mike's patch
is applied, so the results with Mike's patch are bogus (if so, why?) or
hardirq accounting is broken in general.

Btw, which path do apic irqs go? I stumbled across the nmi stuff, but
didn't see anything special for the apic irqs.

Björn

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] i386: Fix softirq accounting with 4K stacks
  2006-06-25 14:24                   ` [PATCH] i386: Fix softirq accounting with 4K stacks Björn Steinbrink
  2006-06-25 15:15                     ` Arjan van de Ven
@ 2006-06-25 17:44                     ` Mike Galbraith
  2006-06-25 17:43                       ` Arjan van de Ven
  1 sibling, 1 reply; 11+ messages in thread
From: Mike Galbraith @ 2006-06-25 17:44 UTC (permalink / raw)
  To: Björn Steinbrink
  Cc: linux-kernel, danial_thom, Linus Torvalds, Andrew Morton

On Sun, 2006-06-25 at 16:24 +0200, Björn Steinbrink wrote:

> Still no idea why your "fix" works, but the following patch also fixes
> the problem and is at least a little more like the RightThing.

Yeah.  I don't know about you, but I fully intend to blatantly ignore
that 'why' ;-)

	-Mike


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] i386: Fix softirq accounting with 4K stacks
  2006-06-25 17:44                     ` Mike Galbraith
@ 2006-06-25 17:43                       ` Arjan van de Ven
  2006-06-25 18:42                         ` Björn Steinbrink
  0 siblings, 1 reply; 11+ messages in thread
From: Arjan van de Ven @ 2006-06-25 17:43 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Björn Steinbrink, linux-kernel, danial_thom, Linus Torvalds,
	Andrew Morton

On Sun, 2006-06-25 at 19:44 +0200, Mike Galbraith wrote:
> On Sun, 2006-06-25 at 16:24 +0200, Björn Steinbrink wrote:
> 
> > Still no idea why your "fix" works, but the following patch also fixes
> > the problem and is at least a little more like the RightThing.
> 
> Yeah.  I don't know about you, but I fully intend to blatantly ignore
> that 'why' ;-)

the why is relatively easy ;)

since the "is a softirq executing" bit is on the stack, and each context
(user, soft and hard irq) has their own stack, it's not automatic that
the hardirq stack gets the softirq-executing flag... which your patch
fixes.

NMI's and apic irqs generally don't go via the normal irq path and thus
don't do a stack switch... so they don't lose the bit (for accounting
purposes)



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] i386: Fix softirq accounting with 4K stacks
  2006-06-25 14:24                   ` [PATCH] i386: Fix softirq accounting with 4K stacks Björn Steinbrink
@ 2006-06-25 15:15                     ` Arjan van de Ven
  2006-06-25 17:44                     ` Mike Galbraith
  1 sibling, 0 replies; 11+ messages in thread
From: Arjan van de Ven @ 2006-06-25 15:15 UTC (permalink / raw)
  To: Björn Steinbrink
  Cc: linux-kernel, Mike Galbraith, danial_thom, Linus Torvalds, Andrew Morton

On Sun, 2006-06-25 at 16:24 +0200, Björn Steinbrink wrote:
> Copy the softirq bits in preempt_count from the current context into the
> hardirq context when using 4K stacks to make the softirq_count macro work
> correctly and thereby fix softirq cpu time accounting.
> 
> Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de>
> 
> diff -Nurp a/arch/i386/kernel/irq.c b/arch/i386/kernel/irq.c
> --- a/arch/i386/kernel/irq.c	2006-03-20 06:53:29.000000000 +0100
> +++ b/arch/i386/kernel/irq.c	2006-06-25 15:49:52.000000000 +0200
> @@ -95,6 +95,14 @@ fastcall unsigned int do_IRQ(struct pt_r
>  		irqctx->tinfo.task = curctx->tinfo.task;
>  		irqctx->tinfo.previous_esp = current_stack_pointer;
>  
> +		/*
> +		 * Copy the softirq bits in preempt_count so that the
> +		 * softirq checks work in the hardirq context.
> +		 */
> +		irqctx->tinfo.preempt_count =
> +			irqctx->tinfo.preempt_count & ~SOFTIRQ_MASK |
> +			curctx->tinfo.preempt_count & SOFTIRQ_MASK;
> +
>  		asm volatile(
>  			"       xchgl   %%ebx,%%esp      \n"
>  			"       call    __do_IRQ         \n"

Hi,

at first I got nervous about the asymmetry of this (eg why only do this
copying only on entry, and not a copy back on exit)... but then again
these bits shouldn't change so your patch is ok as is...
it's regrettable that we need to add this for the softirq accounting;
part of me wishes we would just count irq-vs-user time and be done with
it ;)


Acked-by: Arjan van de Ven <arjan@linux.intel.com>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] i386: Fix softirq accounting with 4K stacks
  2006-06-25 11:12                 ` Björn Steinbrink
@ 2006-06-25 14:24                   ` Björn Steinbrink
  2006-06-25 15:15                     ` Arjan van de Ven
  2006-06-25 17:44                     ` Mike Galbraith
  0 siblings, 2 replies; 11+ messages in thread
From: Björn Steinbrink @ 2006-06-25 14:24 UTC (permalink / raw)
  To: linux-kernel; +Cc: Mike Galbraith, danial_thom, Linus Torvalds, Andrew Morton

On 2006.06.25 13:12:38 +0200, Björn Steinbrink wrote:
> On 2006.06.25 07:06:33 +0200, Mike Galbraith wrote:
> > On Sat, 2006-06-24 at 21:25 +0200, Björn Steinbrink wrote:
> > > OK, it also depends on IO APIC being enabled and active, ie. noapic on
> > > the kernel command line will fix it as well as disabling
> > > CONFIG_X86_IO_APIC. That doesn't help me at all to understand why it
> > > happens though.
> > 
> > Ditto.
> > 
> > > The only difference with IO APIC disabled seems to be that the irq
> > > doesn't get ACKed before update_process_times() gets called.
> > > And your "fix" makes it being called outside of the xtime_lock spinlock
> > > and with a slightly different stack usage AFAICT.
> > 
> > (it's still under the xtime lock)
> 
> No, with IO-APIC enabled, it's using the local APIC timer, thus
> using_apic_timer is 1 and the path right after unlocking in
> timer_interrupt() is taken towards update_process_times().
> 
> > > But none of these should make a difference, right?
> > 
> > Not that I can see, but then it's pretty dark down here.  Anybody got a
> > flashlight I can borrow? ;-)
> 
> Guess I found one, not sure if it works correctly though ;)
> 
> Using 4K stacks, we have one stack for hard irqs and one for soft irqs,
> both having their own threadinfo and thus preemptcount. Thus the call to
> softirq_count() in account_system_time() will always return 0 when
> called in hard irq context. Additionally preemptcount is always set to
> HARDIRQ_OFFSET in hard irq context, so
> hardirq_count() - hardirq_offset is 0 all the time as well.
> 
> But that doesn't fit the fact that I at least get hard irq accounting
> when booting with noapic. And it also doesn't explain why your fix
> works, fixing both, soft and hard irq accounting. Am I missing some
> code path that calls smp_local_timer_interrupt? There's
> smp_apic_timer_interrupt(), but that seems to be unused on i386.

OK, the hardirq_count() thing is not HARDIRQ_OFFSET all the time. When
we interrupt a hard irq, we are already using the hard irq stack and
thus irq_enter() in do_IRQ() works as it should. The zero value for hi
is there with IO-APIC and 8K stacks as well, so that's either right or
an other bug.

Still no idea why your "fix" works, but the following patch also fixes
the problem and is at least a little more like the RightThing.


---

Copy the softirq bits in preempt_count from the current context into the
hardirq context when using 4K stacks to make the softirq_count macro work
correctly and thereby fix softirq cpu time accounting.

Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de>

diff -Nurp a/arch/i386/kernel/irq.c b/arch/i386/kernel/irq.c
--- a/arch/i386/kernel/irq.c	2006-03-20 06:53:29.000000000 +0100
+++ b/arch/i386/kernel/irq.c	2006-06-25 15:49:52.000000000 +0200
@@ -95,6 +95,14 @@ fastcall unsigned int do_IRQ(struct pt_r
 		irqctx->tinfo.task = curctx->tinfo.task;
 		irqctx->tinfo.previous_esp = current_stack_pointer;
 
+		/*
+		 * Copy the softirq bits in preempt_count so that the
+		 * softirq checks work in the hardirq context.
+		 */
+		irqctx->tinfo.preempt_count =
+			irqctx->tinfo.preempt_count & ~SOFTIRQ_MASK |
+			curctx->tinfo.preempt_count & SOFTIRQ_MASK;
+
 		asm volatile(
 			"       xchgl   %%ebx,%%esp      \n"
 			"       call    __do_IRQ         \n"

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2006-06-27 10:04 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-06-26 22:54 [PATCH] i386: Fix softirq accounting with 4K stacks Chuck Ebbert
2006-06-27  0:57 ` Björn Steinbrink
  -- strict thread matches above, loose matches on Subject: below --
2006-06-24  5:59 Measuring tools - top and interrupts Mike Galbraith
2006-06-24  6:26 ` Mike Galbraith
2006-06-24  9:21   ` Björn Steinbrink
2006-06-24  9:51     ` Mike Galbraith
2006-06-24 11:41       ` Mike Galbraith
2006-06-24 15:40         ` Björn Steinbrink
2006-06-24 16:23           ` Mike Galbraith
2006-06-24 19:25             ` Björn Steinbrink
2006-06-25  5:06               ` Mike Galbraith
2006-06-25 11:12                 ` Björn Steinbrink
2006-06-25 14:24                   ` [PATCH] i386: Fix softirq accounting with 4K stacks Björn Steinbrink
2006-06-25 15:15                     ` Arjan van de Ven
2006-06-25 17:44                     ` Mike Galbraith
2006-06-25 17:43                       ` Arjan van de Ven
2006-06-25 18:42                         ` Björn Steinbrink
2006-06-26  2:23                           ` Mike Galbraith
2006-06-26  3:05                             ` Mike Galbraith
2006-06-26 17:58                               ` Björn Steinbrink
2006-06-27 10:09                                 ` Mike Galbraith

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).