linux-kernel.vger.kernel.org archive mirror
* [patch] severe softirq handling performance bug, fix, 2.4.5
@ 2001-05-26 17:59 Ingo Molnar
  2001-05-26 19:33 ` [patch] softirq-2.4.5-A1 Ingo Molnar
                   ` (4 more replies)
  0 siblings, 5 replies; 22+ messages in thread
From: Ingo Molnar @ 2001-05-26 17:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: Linus Torvalds, Alan Cox, David S. Miller, Alexey Kuznetsov

[-- Attachment #1: Type: TEXT/PLAIN, Size: 4035 bytes --]


i've been seeing really bad average TCP latencies on certain gigabit cards
(~300-400 microseconds instead of the expected 100-200 microseconds), ever
since softnet went into the main kernel, and never found a real
explanation for it, until today.

the problem always went away when i tried to use tcpdump or strace, so the
bug remained hidden and it was hard to prove that it actually existed (apart
from the bad lat_tcp numbers). We found many related bugs, but this
problem remained. tcpdumps done on the network did not show any fault of
the TCP stack. The lat_tcp latencies fluctuated a lot, but for certain
cards the latencies were stable, so i suspected some sort of hw problem.
The loopback networking device never showed these problems, which added to
the mystery.

the problem turned out to be a severe softirq handling bug in the x86
code.

background: soft interrupts were introduced as a generic kernel framework
around January 2000, as part of the softnet networking rewrite, which
predated the final scalability rewrite of the Linux TCP/IP networking
code. Soft interrupts have unique semantics; they can best be described as
'IRQ-triggered atomic system calls'. (unlike bottom halves, soft-IRQs do
not preempt kernel code.)

soft-IRQs, as their name suggests, are used from device interrupts ('hard
interrupts') to trigger 'background' work related to interrupts. Soft-IRQs
are triggered per-CPU, and they are supposed to execute whenever nothing
else is being done by the kernel on that particular CPU. Softirqs are
executed with interrupts enabled, so hard interrupts can re-activate them
while they are executing. do_softirq() is a kernel function that returns
with IRQs disabled, and at that point it's guaranteed that there are no
more pending softirqs for this CPU.
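
in code terms, the per-CPU check that the attached patch adds to the idle
handlers looks roughly like this in C (an illustrative sketch using the
2.4 interfaces; the helper name is made up, and this is not a verbatim
patch hunk):

	#include <linux/interrupt.h>
	#include <linux/smp.h>
	#include <asm/system.h>

	static inline void run_pending_softirqs(void)
	{
		int cpu = smp_processor_id();

		__cli();		/* the test must be done with IRQs off */
		if (softirq_active(cpu) & softirq_mask(cpu))
			do_softirq();	/* returns with IRQs disabled again */
		__sti();		/* the real idle paths handle the IRQ
					   state differently at this point */
	}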

this mechanism was the intention, but not the reality. In two important
and frequently used code paths it was possible for an active soft-IRQ to
"go unnoticed": i measured latencies as long as 140 milliseconds (!!!)
between softirq activation and softirq execution in certain cases. This is
obviously bad behavior.

the two error cases are:

 #1 hard-IRQ interrupts user-space code, activates softirq, and returns to
    user-space code

 #2 hard-IRQ interrupts the idle task, activates softirq and returns to
    the idle task.

category #1 is easy to fix: in entry.S we have to check active softirqs
not only in the exception and ret-from-syscall cases, but also in the
IRQ-ret-to-userspace case.

category #2 is subtle, because the idle process is kernel code, so
returning to it does not execute active softirqs. The two main types of
idle handlers both had a window to 'miss' softirq execution:

- the HLT-based default handler could be called after schedule()'s check
  for softirqs, but with IRQs enabled again in the meantime. In this case
  an interrupt handler has a window to activate a softirq, and neither the
  IRQ return code nor the idle loop would execute it immediately. The fix
  is to do a softirq check right before the safe_halt call.

- the idle-poll handler did not check for softirqs either; it now does
  this in every iteration.

with the attached softirq-2.4.5-A0 patch applied to vanilla 2.4.5, i see
picture-perfect lat_tcp latencies of 109 microseconds over a real gigabit
network. I see very stable (and very good) TUX latencies as well. TCP
bandwidth also improved, probably due to the cache-locality bonus of
executing softirqs right after hardirqs.

[I'd like to ask everyone who had TCP latency problems (or other
networking performance problems) to test 2.4.5 with this patch applied -
thanks!]

impact of the bug: all softirq-using code is affected, mostly networking.
The loopback net driver was not affected because it's not interrupt-based.
The bug went away under strace or tcpdump because those two utilities
pumped system calls into the system, which 'fixed' the softirq handling
bug.

(other softirq-based code includes the tasklet code, and the keyboard
code uses tasklets, so the keyboard code might be affected as well.)

	Ingo

[-- Attachment #2: Type: TEXT/PLAIN, Size: 1557 bytes --]

--- linux/arch/i386/kernel/entry.S.orig	Sat May 26 19:20:48 2001
+++ linux/arch/i386/kernel/entry.S	Sat May 26 19:21:52 2001
@@ -214,7 +214,6 @@
 #endif
 	jne   handle_softirq
 	
-ret_with_reschedule:
 	cmpl $0,need_resched(%ebx)
 	jne reschedule
 	cmpl $0,sigpending(%ebx)
@@ -275,7 +274,7 @@
 	movl EFLAGS(%esp),%eax		# mix EFLAGS and CS
 	movb CS(%esp),%al
 	testl $(VM_MASK | 3),%eax	# return to VM86 mode or non-supervisor?
-	jne ret_with_reschedule
+	jne ret_from_sys_call
 	jmp restore_all
 
 	ALIGN
--- linux/arch/i386/kernel/process.c.orig	Sat May 26 19:21:56 2001
+++ linux/arch/i386/kernel/process.c	Sat May 26 19:28:06 2001
@@ -79,8 +79,12 @@
  */
 static void default_idle(void)
 {
+	int this_cpu = smp_processor_id();
+
 	if (current_cpu_data.hlt_works_ok && !hlt_counter) {
 		__cli();
+		if (softirq_active(this_cpu) & softirq_mask(this_cpu))
+			do_softirq();
 		if (!current->need_resched)
 			safe_halt();
 		else
@@ -95,6 +99,7 @@
  */
 static void poll_idle (void)
 {
+	int this_cpu = smp_processor_id();
 	int oldval;
 
 	__sti();
@@ -104,14 +109,17 @@
 	 * run here:
 	 */
 	oldval = xchg(&current->need_resched, -1);
+	if (oldval)
+		return;
 
-	if (!oldval)
-		asm volatile(
-			"2:"
-			"cmpl $-1, %0;"
-			"rep; nop;"
-			"je 2b;"
-				: :"m" (current->need_resched));
+	while (current->need_resched == -1) {
+		if (softirq_active(this_cpu) & softirq_mask(this_cpu)) {
+			__cli();
+			do_softirq();
+			__sti();
+		}
+		asm volatile( "rep; nop;" );
+	}
 }
 
 /*


* [patch] softirq-2.4.5-A1
  2001-05-26 17:59 [patch] severe softirq handling performance bug, fix, 2.4.5 Ingo Molnar
@ 2001-05-26 19:33 ` Ingo Molnar
  2001-05-27 17:12   ` Andrea Arcangeli
  2001-05-26 23:55 ` [patch] severe softirq handling performance bug, fix, 2.4.5 David S. Miller
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 22+ messages in thread
From: Ingo Molnar @ 2001-05-26 19:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Alan Cox, David S. Miller, Alexey Kuznetsov, arjanv

[-- Attachment #1: Type: TEXT/PLAIN, Size: 856 bytes --]


i've attached the next round of softirq-handling fixes.

correctness fixes:

 - check for softirqs in the signal return path(s)

 - make sure all the entry.S return paths check for softirqs with
   interrupts disabled, otherwise we can end up getting another softirq
   right after the test. (and this causes a missed softirq.)

 - add softirq handling to the ACPI and APM code (untested)

performance tweaks:

 - remove __cli() from idle-poll; it's not needed.

 - separate exception handling and irq-ret path to avoid double-checking
   for softirqs

with this -A1 patch applied, softirqs should be executed by the kernel at
the first possible point, and there should be no unbounded softirq
latencies anymore.

patch is against vanilla 2.4.5 (patch includes the previous fixes as
well). The patch compiles, boots and works just fine on x86 SMP.

	Ingo

[-- Attachment #2: Type: TEXT/PLAIN, Size: 4665 bytes --]

--- linux/drivers/acpi/cpu.c.orig	Sat May 26 21:12:01 2001
+++ linux/drivers/acpi/cpu.c	Sat May 26 21:14:39 2001
@@ -231,10 +231,14 @@
 	sleep_level = 1;
 	acpi_sleep_on_busmaster();
 	for (;;) {
+		int this_cpu = smp_processor_id();
 		unsigned long time;
 		unsigned long diff;
 
 		__cli();
+		if (softirq_active(this_cpu) & softirq_mask(this_cpu))
+			do_softirq();
+
 		if (current->need_resched)
 			goto out;
 		time = acpi_read_pm_timer();
--- linux/arch/i386/kernel/entry.S.orig	Thu Nov  9 02:09:50 2000
+++ linux/arch/i386/kernel/entry.S	Sat May 26 21:21:39 2001
@@ -133,6 +133,17 @@
 	movl $-8192, reg; \
 	andl %esp, reg
 
+#ifdef CONFIG_SMP
+# define CHECK_SOFTIRQ \
+	movl processor(%ebx),%eax; \
+	shll $CONFIG_X86_L1_CACHE_SHIFT,%eax; \
+	movl SYMBOL_NAME(irq_stat)(,%eax),%ecx; \
+	testl SYMBOL_NAME(irq_stat)+4(,%eax),%ecx
+#else
+# define CHECK_SOFTIRQ \
+	movl SYMBOL_NAME(irq_stat),%ecx; \
+	testl SYMBOL_NAME(irq_stat)+4,%ecx
+#endif
 ENTRY(lcall7)
 	pushfl			# We get a different stack layout with call gates,
 	pushl %eax		# which has to be cleaned up later..
@@ -203,18 +214,10 @@
 	call *SYMBOL_NAME(sys_call_table)(,%eax,4)
 	movl %eax,EAX(%esp)		# save the return value
 ENTRY(ret_from_sys_call)
-#ifdef CONFIG_SMP
-	movl processor(%ebx),%eax
-	shll $CONFIG_X86_L1_CACHE_SHIFT,%eax
-	movl SYMBOL_NAME(irq_stat)(,%eax),%ecx		# softirq_active
-	testl SYMBOL_NAME(irq_stat)+4(,%eax),%ecx	# softirq_mask
-#else
-	movl SYMBOL_NAME(irq_stat),%ecx		# softirq_active
-	testl SYMBOL_NAME(irq_stat)+4,%ecx	# softirq_mask
-#endif
-	jne   handle_softirq
+	cli
+	CHECK_SOFTIRQ
+	jne handle_softirq
 	
-ret_with_reschedule:
 	cmpl $0,need_resched(%ebx)
 	jne reschedule
 	cmpl $0,sigpending(%ebx)
@@ -230,6 +233,13 @@
 	jne v86_signal_return
 	xorl %edx,%edx
 	call SYMBOL_NAME(do_signal)
+#ifdef CONFIG_SMP
+	GET_CURRENT(%ebx)
+#endif
+	cli
+	CHECK_SOFTIRQ
+	je restore_all
+	call SYMBOL_NAME(do_softirq)
 	jmp restore_all
 
 	ALIGN
@@ -238,6 +248,13 @@
 	movl %eax,%esp
 	xorl %edx,%edx
 	call SYMBOL_NAME(do_signal)
+#ifdef CONFIG_SMP
+	GET_CURRENT(%ebx)
+#endif
+	cli
+	CHECK_SOFTIRQ
+	je restore_all
+	call SYMBOL_NAME(do_softirq)
 	jmp restore_all
 
 	ALIGN
@@ -260,22 +277,22 @@
 ret_from_exception:
 #ifdef CONFIG_SMP
 	GET_CURRENT(%ebx)
-	movl processor(%ebx),%eax
-	shll $CONFIG_X86_L1_CACHE_SHIFT,%eax
-	movl SYMBOL_NAME(irq_stat)(,%eax),%ecx		# softirq_active
-	testl SYMBOL_NAME(irq_stat)+4(,%eax),%ecx	# softirq_mask
-#else
-	movl SYMBOL_NAME(irq_stat),%ecx		# softirq_active
-	testl SYMBOL_NAME(irq_stat)+4,%ecx	# softirq_mask
 #endif
+	cli
+	CHECK_SOFTIRQ
 	jne   handle_softirq
+	cmpl $0,need_resched(%ebx)
+	jne reschedule
+	cmpl $0,sigpending(%ebx)
+	jne signal_return
+	jmp restore_all
 
 ENTRY(ret_from_intr)
 	GET_CURRENT(%ebx)
 	movl EFLAGS(%esp),%eax		# mix EFLAGS and CS
 	movb CS(%esp),%al
 	testl $(VM_MASK | 3),%eax	# return to VM86 mode or non-supervisor?
-	jne ret_with_reschedule
+	jne ret_from_sys_call
 	jmp restore_all
 
 	ALIGN
--- linux/arch/i386/kernel/process.c.orig	Sat May 26 20:54:30 2001
+++ linux/arch/i386/kernel/process.c	Sat May 26 21:17:05 2001
@@ -79,8 +79,12 @@
  */
 static void default_idle(void)
 {
+	int this_cpu = smp_processor_id();
+
 	if (current_cpu_data.hlt_works_ok && !hlt_counter) {
 		__cli();
+		if (softirq_active(this_cpu) & softirq_mask(this_cpu))
+			do_softirq();
 		if (!current->need_resched)
 			safe_halt();
 		else
@@ -95,6 +99,7 @@
  */
 static void poll_idle (void)
 {
+	int this_cpu = smp_processor_id();
 	int oldval;
 
 	__sti();
@@ -104,14 +109,16 @@
 	 * run here:
 	 */
 	oldval = xchg(&current->need_resched, -1);
+	if (oldval)
+		return;
 
-	if (!oldval)
-		asm volatile(
-			"2:"
-			"cmpl $-1, %0;"
-			"rep; nop;"
-			"je 2b;"
-				: :"m" (current->need_resched));
+	while (current->need_resched == -1) {
+		if (softirq_active(this_cpu) & softirq_mask(this_cpu)) {
+			do_softirq();
+			__sti();
+		}
+		asm volatile( "rep; nop;" );
+	}
 }
 
 /*
--- linux/arch/i386/kernel/apm.c.orig	Sat May 26 21:15:12 2001
+++ linux/arch/i386/kernel/apm.c	Sat May 26 21:15:58 2001
@@ -605,11 +605,15 @@
 	while (1) {
 		if (!current->need_resched) {
 			if (jiffies - start_idle < HARD_IDLE_TIMEOUT) {
+				int this_cpu = smp_processor_id();
+
 				if (!current_cpu_data.hlt_works_ok)
 					continue;
 				if (hlt_counter)
 					continue;
 				__cli();
+				if (softirq_active(this_cpu) & softirq_mask(this_cpu))
+					do_softirq();
 				if (!current->need_resched)
 					safe_halt();
 				else


* Re: [patch] severe softirq handling performance bug, fix, 2.4.5
  2001-05-26 17:59 [patch] severe softirq handling performance bug, fix, 2.4.5 Ingo Molnar
  2001-05-26 19:33 ` [patch] softirq-2.4.5-A1 Ingo Molnar
@ 2001-05-26 23:55 ` David S. Miller
  2001-05-27  3:28   ` Albert D. Cahalan
                     ` (3 more replies)
  2001-05-27 17:07 ` [patch] severe softirq handling performance bug, fix, 2.4.5 Andrea Arcangeli
                   ` (2 subsequent siblings)
  4 siblings, 4 replies; 22+ messages in thread
From: David S. Miller @ 2001-05-26 23:55 UTC (permalink / raw)
  To: mingo; +Cc: linux-kernel, Linus Torvalds, Alan Cox, Alexey Kuznetsov


Ingo Molnar writes:
 > (unlike bottom halves, soft-IRQs do not preempt kernel code.)
 ...

Since when do we have this rule? :-)

 > the two error cases are:
 > 
 >  #1 hard-IRQ interrupts user-space code, activates softirq, and returns to
 >     user-space code
 > 
 >  #2 hard-IRQ interrupts the idle task, activates softirq and returns to
 >     the idle task.
 > 
 > category #1 is easy to fix, in entry.S we have to check active softirqs
 > not only the exception and ret-from-syscall cases, but also in the
 > IRQ-ret-to-userspace case.
 > 
 > category #2 is subtle, because the idle process is kernel code, so
 > returning to it we do not execute active softirqs. The main two types of
 > idle handlers both had a window do 'miss' softirq execution:

Ingo, I don't think this is the fix.

You should check softirqs on return from every single IRQ.
do_softirq() itself makes sure that we won't run softirqs
while already doing so or while already nested in a hard-IRQ.

Every port works this way, I don't know where you got this "soft-IRQs
cannot run when returning to kernel code" rule, it simply doesn't
exist.

And looking at the x86 code, I don't even understand how your fixes
can make a difference: what about the do_softirq() call in
arch/i386/kernel/irq.c:do_IRQ()???  That should be taking care of all
these "error cases", right?

Later,
David S. Miller
davem@redhat.com


* Re: [patch] severe softirq handling performance bug, fix, 2.4.5
  2001-05-26 23:55 ` [patch] severe softirq handling performance bug, fix, 2.4.5 David S. Miller
@ 2001-05-27  3:28   ` Albert D. Cahalan
  2001-05-27 17:18   ` Andrea Arcangeli
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 22+ messages in thread
From: Albert D. Cahalan @ 2001-05-27  3:28 UTC (permalink / raw)
  To: David S. Miller; +Cc: mingo, linux-kernel

David S. Miller
> Ingo Molnar writes:

>> (unlike bottom halves, soft-IRQs do not preempt kernel code.)
> ...
>
> Since when do we have this rule? :-)
...
> You should check Softirqs on return from every single IRQ.
> In do_softirq() it will make sure that we won't run softirqs
> while already doing so or being already nested in a hard-IRQ.
> 
> Every port works this way, I don't know where you got this "soft-IRQs
> cannot run when returning to kernel code" rule, it simply doesn't
> exist.

After you two argue this out, please toss a note in Documentation.


* Re: [patch] severe softirq handling performance bug, fix, 2.4.5
  2001-05-26 17:59 [patch] severe softirq handling performance bug, fix, 2.4.5 Ingo Molnar
  2001-05-26 19:33 ` [patch] softirq-2.4.5-A1 Ingo Molnar
  2001-05-26 23:55 ` [patch] severe softirq handling performance bug, fix, 2.4.5 David S. Miller
@ 2001-05-27 17:07 ` Andrea Arcangeli
  2001-05-27 17:17 ` David S. Miller
  2001-06-10 10:40 ` Rusty Russell
  4 siblings, 0 replies; 22+ messages in thread
From: Andrea Arcangeli @ 2001-05-27 17:07 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Linus Torvalds, Alan Cox, David S. Miller,
	Alexey Kuznetsov

On Sat, May 26, 2001 at 07:59:28PM +0200, Ingo Molnar wrote:
> the two error cases are:
> 
>  #1 hard-IRQ interrupts user-space code, activates softirq, and returns to
>     user-space code

Before returning to userspace do_IRQ just runs do_softirq by hand from C
code.

>  #2 hard-IRQ interrupts the idle task, activates softirq and returns to
>     the idle task.

The problem only happens when we return to the idle task and a softirq
has been marked active again, and we cannot keep running it or we risk a
soft deadlock.

I think the final fix for those issues (plus the case of a softirq
marking itself running all the time) is ksoftirqd: it lets the scheduler
balance the softirq load, and you can renice it as well. please try to
benchmark after applying it; it should solve all your troubles cleanly.
during tux load ksoftirqd runs quite a lot, btw:

	ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.5aa1/00_ksoftirqd-4

don't forget to apply this scheduler fix first, or you risk running into
trouble:

	ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.5aa1/00_cpus_allowed-1

I also suggest applying the other scheduler fixes, like sched_yield and
parent_timeslice, before running the test.
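
(for reference, the per-cpu daemon in the ksoftirqd patch is conceptually
just a loop like the one below; this is a rough sketch with guessed
details and names, based on the wakeup snippet quoted later in this
thread, not the actual 00_ksoftirqd-4 code:)

	#include <linux/sched.h>
	#include <linux/interrupt.h>

	static int ksoftirqd(void *__unused)	/* one such thread per cpu */
	{
		int cpu = smp_processor_id();

		daemonize();
		current->nice = 19;	/* low priority: the scheduler decides
					   when the softirq backlog gets cpu */
		__set_current_state(TASK_INTERRUPTIBLE);

		for (;;) {
			if (!(softirq_active(cpu) & softirq_mask(cpu)))
				schedule();	/* sleep until do_softirq() wakes us */

			__set_current_state(TASK_RUNNING);
			while (softirq_active(cpu) & softirq_mask(cpu)) {
				do_softirq();
				if (current->need_resched)
					schedule();	/* stay fair to other tasks */
			}
			__set_current_state(TASK_INTERRUPTIBLE);
		}
	}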

Andrea


* Re: [patch] softirq-2.4.5-A1
  2001-05-26 19:33 ` [patch] softirq-2.4.5-A1 Ingo Molnar
@ 2001-05-27 17:12   ` Andrea Arcangeli
  2001-05-27 19:08     ` Ingo Molnar
  0 siblings, 1 reply; 22+ messages in thread
From: Andrea Arcangeli @ 2001-05-27 17:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Linus Torvalds, Alan Cox, David S. Miller,
	Alexey Kuznetsov, arjanv

On Sat, May 26, 2001 at 09:33:59PM +0200, Ingo Molnar wrote:
>    interrupts disabled, otherwise we can end up getting another softirq
>    right after the test. (and this causes a missed softirq.)

an irq that could mark the softirq active under entry.S will also run
do_softirq itself before iret-ing back to entry.S. If the softirq remains
active after an irq, it is because it was marked active again under
do_softirq, and ksoftirqd is the way to go for fixing that case I think.

Andrea


* Re: [patch] severe softirq handling performance bug, fix, 2.4.5
  2001-05-26 17:59 [patch] severe softirq handling performance bug, fix, 2.4.5 Ingo Molnar
                   ` (2 preceding siblings ...)
  2001-05-27 17:07 ` [patch] severe softirq handling performance bug, fix, 2.4.5 Andrea Arcangeli
@ 2001-05-27 17:17 ` David S. Miller
  2001-05-27 17:56   ` Andrea Arcangeli
  2001-05-27 19:09   ` David S. Miller
  2001-06-10 10:40 ` Rusty Russell
  4 siblings, 2 replies; 22+ messages in thread
From: David S. Miller @ 2001-05-27 17:17 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Ingo Molnar, linux-kernel, Alan Cox, Alexey Kuznetsov


[ Linus removed from the CC:, he wouldn't read any of this since
  he's in Japan currently :-)]

Andrea Arcangeli writes:
 > On Sat, May 26, 2001 at 07:59:28PM +0200, Ingo Molnar wrote:
 > > the two error cases are:
 > > 
 > >  #1 hard-IRQ interrupts user-space code, activates softirq, and returns to
 > >     user-space code
 > 
 > Before returning to userspace do_IRQ just runs do_softirq by hand from C
 > code.

Ok, someone agrees with me. :-)

 > >  #2 hard-IRQ interrupts the idle task, activates softirq and returns to
 > >     the idle task.
 > 
 > The problem only happens when we return to the idle task and a softirq
 > is been marked active again and we cannot keep running it or we risk to
 > soft deadlock.

I still fail to understand: won't the C code in do_IRQ() handle
this case as well?  What is so special about returning from an
interrupt to the idle task on x86?  And what about that special'ness
makes the code at the end of do_IRQ() magically not run?

In fact, with the do_IRQ() check _and_ the check in schedule() itself,
the only case entry.S really has to deal with is "end of system call",
which it does.

Andrea, I think you are talking about a deeper and different problem.
Specifically, a softirq that makes new softirqs happen, or something
like this.  Right?

Later,
David S. Miller
davem@redhat.com


* Re: [patch] severe softirq handling performance bug, fix, 2.4.5
  2001-05-26 23:55 ` [patch] severe softirq handling performance bug, fix, 2.4.5 David S. Miller
  2001-05-27  3:28   ` Albert D. Cahalan
@ 2001-05-27 17:18   ` Andrea Arcangeli
  2001-05-27 19:00   ` [patch] softirq-2.4.5-B0 Ingo Molnar
  2001-05-27 19:15   ` David S. Miller
  3 siblings, 0 replies; 22+ messages in thread
From: Andrea Arcangeli @ 2001-05-27 17:18 UTC (permalink / raw)
  To: David S. Miller
  Cc: mingo, linux-kernel, Linus Torvalds, Alan Cox, Alexey Kuznetsov

On Sat, May 26, 2001 at 04:55:06PM -0700, David S. Miller wrote:
> And looking at the x86 code, I don't even understand how your fixes
> can make a difference, what about the do_softirq() call in
> arch/i386/kernel/irq.c:do_IRQ()???  That should be taking care of all
> these "error cases" right?

Yes, except when a softirq is marked running again under do_softirq (it
is mostly an issue when the machine is otherwise idle, which means we
will waste cpu if we don't run the softirq immediately; this problem was
noticed by Manfred last month, but that is just a special case of the
generic case of a softirq marked running again under do_softirq), and
all those cases are supposed to be taken care of by ksoftirqd.

Andrea


* Re: [patch] severe softirq handling performance bug, fix, 2.4.5
  2001-05-27 17:17 ` David S. Miller
@ 2001-05-27 17:56   ` Andrea Arcangeli
  2001-05-27 19:05     ` Ingo Molnar
  2001-05-27 19:09   ` David S. Miller
  1 sibling, 1 reply; 22+ messages in thread
From: Andrea Arcangeli @ 2001-05-27 17:56 UTC (permalink / raw)
  To: David S. Miller; +Cc: Ingo Molnar, linux-kernel, Alan Cox, Alexey Kuznetsov

On Sun, May 27, 2001 at 10:17:22AM -0700, David S. Miller wrote:
> I still fail to understand, won't the C code in do_IRQ() handle
> this case as well?  What is so special about returning from an
> interrupt to the idle task on x86?  And what about that special'ness
> makes the code at the end of do_IRQ() magically not run?

Nothing special of course, and I don't like making it special in fact;
everything is fine and nothing changes unless the underlined check
(part of the softirq patch) triggers:

	if (active) {
		struct softirq_action *h;

restart:
		/* Reset active bitmask before enabling irqs */
		softirq_active(cpu) &= ~active;

		local_irq_enable();

		h = softirq_vec;
		mask &= ~active;

		do {
			if (active & 1)
				h->action(h);
			h++;
			active >>= 1;
		} while (active);

		local_irq_disable();

		active = softirq_active(cpu);
		if ((active &= mask) != 0)
			goto retry;
	}
	if (softirq_active(cpu) & softirq_mask(cpu)) {
	    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I mean everything is fine until the same softirq is marked active again
under do_softirq, in which case neither the do_softirq in do_IRQ will
run it (because we are in the critical section and we hold the per-cpu
locks), nor will we run it again ourselves from the underlying do_softirq,
to avoid live locking in do_softirq.

When the check triggers, the ksoftirqd patch just wakes up the kernel
daemon so the softirq flood will be balanced by the scheduler; possibly it
could keep running all the time if the machine is idle, as if we did not
do mask &= ~active in do_softirq, but without risk of live locks because
the scheduler will be fair.

	if (softirq_active(cpu) & softirq_mask(cpu)) {
		/*
		 * we cannot loop indefinitely here to avoid userspace starvation,
		 * but we also don't want to introduce a worst case 1/HZ latency
		 * to the pending events, so lets the scheduler to balance
		 * the irq load for us.
		 */
		struct task_struct * tsk = ksoftirqd_task(cpu);
		if (tsk && tsk->state != TASK_RUNNING)
			wake_up_process(tsk);
	}

> Andrea, I think you are talking about a deeper and different problem.

That is exactly the same problem pointed out by Ingo with the idle
task as far as I can tell.

If the machine is idle, waiting for the next interrupt before running the
softirq is even worse, because we definitely waste cpu; it's not only a
latency issue in that case. But the problem is the same if the machine
is loaded and we don't run the softirq again because it was marked
active under us.

Andrea


* [patch] softirq-2.4.5-B0
  2001-05-26 23:55 ` [patch] severe softirq handling performance bug, fix, 2.4.5 David S. Miller
  2001-05-27  3:28   ` Albert D. Cahalan
  2001-05-27 17:18   ` Andrea Arcangeli
@ 2001-05-27 19:00   ` Ingo Molnar
  2001-05-27 19:15   ` David S. Miller
  3 siblings, 0 replies; 22+ messages in thread
From: Ingo Molnar @ 2001-05-27 19:00 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel, Linus Torvalds, Alan Cox, Alexey Kuznetsov

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1531 bytes --]


On Sat, 26 May 2001, David S. Miller wrote:

> And looking at the x86 code, I don't even understand how your fixes
> can make a difference, what about the do_softirq() call in
> arch/i386/kernel/irq.c:do_IRQ()??? [...]

[you are right, it's a brain fart on my part. doh. i guess i was too happy
having fixed the longstanding latency problem.]

the TCP latency issues and the missed softirq execution bug are still
there, but for a slightly different reason.

the bug/misbehavior causing bad latencies turned out to be the following:
if a hardirq triggers a softirq, but syscall-level code on the same CPU
has disabled local bhs via local_bh_disable(), then we 'miss' the execution
of the softirq until the next IRQ (or the next direct call to do_softirq()).
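
in (pseudo-)code, the problematic pattern is simply the following
(hypothetical syscall-level code, purely for illustration):

	void some_bh_atomic_section(void)	/* hypothetical */
	{
		local_bh_disable();
		/*
		 * ... a hard IRQ arriving here can raise e.g. NET_RX_SOFTIRQ;
		 * the do_softirq() at the end of do_IRQ() returns immediately,
		 * because in_interrupt() sees the raised bh count ...
		 */
		local_bh_enable();	/* stock 2.4.5: just decrements the count;
					   the pending softirq stays unprocessed
					   until the next IRQ (or an explicit
					   do_softirq() call) */
	}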

the attached softirq-2.4.5-B0 patch fixes this problem by calling
do_softirq() from local_bh_enable() [if the bh count drops to 0, to avoid
recursion]. This slightly changes local_bh_enable() semantics: calling
do_softirq() has the side-effect of disabling/enabling interrupts, so code
that used local_bh_enable() while interrupts were disabled (and depended on
them staying disabled) will break. I checked all code that uses
local_bh_enable() via a debugging check, and the only (harmless) violation
of this new rule is machine_restart() in the x86 tree.
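
(in other words, a hypothetical sequence like the following would now be
broken, because it relies on IRQs staying disabled across the enable:)

	__cli();
	/* ... */
	local_bh_enable();	/* with this patch it may call do_softirq(),
				   which re-enables interrupts as a side effect */
	/* code here can no longer assume IRQs are still disabled */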

Yesterday's patches fix this problem too, but only as a lucky side-effect,
and only in the idle-poll case. 2.4.5 + softirq-2.4.5-B0 TCP latency is
down from a fluctuating 300-400 microseconds to a stable 109 microseconds.

	Ingo

[-- Attachment #2: Type: TEXT/PLAIN, Size: 1617 bytes --]

--- linux/kernel/softirq.c.orig	Sun May 27 20:57:36 2001
+++ linux/kernel/softirq.c	Sun May 27 20:57:52 2001
@@ -87,7 +87,7 @@
 			goto retry;
 	}
 
-	local_bh_enable();
+	__local_bh_enable();
 
 	/* Leave with locally disabled hard irqs. It is critical to close
 	 * window for infinite recursion, while we help local bh count,
--- linux/include/asm-i386/softirq.h.orig	Sun May 27 20:56:58 2001
+++ linux/include/asm-i386/softirq.h	Sun May 27 20:58:15 2001
@@ -5,10 +5,12 @@
 #include <asm/hardirq.h>
 
 #define cpu_bh_disable(cpu)	do { local_bh_count(cpu)++; barrier(); } while (0)
-#define cpu_bh_enable(cpu)	do { barrier(); local_bh_count(cpu)--; } while (0)
+#define __cpu_bh_enable(cpu)	do { barrier(); local_bh_count(cpu)--; } while (0)
+extern void cpu_bh_enable (unsigned int cpu);
 
 #define local_bh_disable()	cpu_bh_disable(smp_processor_id())
 #define local_bh_enable()	cpu_bh_enable(smp_processor_id())
+#define __local_bh_enable()	__cpu_bh_enable(smp_processor_id())
 
 #define in_softirq() (local_bh_count(smp_processor_id()) != 0)
 
--- linux/arch/i386/kernel/irq.c.orig	Sun May 27 20:55:08 2001
+++ linux/arch/i386/kernel/irq.c	Sun May 27 20:56:45 2001
@@ -628,6 +628,19 @@
 	return 1;
 }
 
+/*
+ * A bh-atomic section might have blocked the execution of softirqs.
+ * re-run them if appropriate.
+ */
+void cpu_bh_enable (unsigned int cpu)
+{
+	if (!--local_bh_count(cpu) &&
+			(softirq_active(cpu) & softirq_mask(cpu))) {
+		do_softirq();
+		__sti();
+	}
+}
+
 /**
  *	request_irq - allocate an interrupt line
  *	@irq: Interrupt line to allocate


* Re: [patch] severe softirq handling performance bug, fix, 2.4.5
  2001-05-27 17:56   ` Andrea Arcangeli
@ 2001-05-27 19:05     ` Ingo Molnar
  2001-05-27 19:55       ` Andrea Arcangeli
  0 siblings, 1 reply; 22+ messages in thread
From: Ingo Molnar @ 2001-05-27 19:05 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: David S. Miller, linux-kernel, Alan Cox, Alexey Kuznetsov


On Sun, 27 May 2001, Andrea Arcangeli wrote:

> I mean everything is fine until the same softirq is marked active
> again under do_softirq, in such case neither the do_softirq in do_IRQ
> will run it (because we are in the critical section and we hold the
> per-cpu locks), nor we will run it again ourself from the underlying
> do_softirq to avoid live locking into do_softirq.

if you mean the stock kernel, the scenario you describe is not how it
behaves, because only IRQ contexts can mark a softirq active again. And
those IRQ contexts will run do_IRQ() naturally, so while *this*
do_softirq() invocation won't run those reactivated softirqs, the IRQ
context that just triggered the softirq will do so.

the real source of softirq latencies is the local_bh_disable()/enable()
behavior, see my previous patch.

	Ingo




* Re: [patch] softirq-2.4.5-A1
  2001-05-27 17:12   ` Andrea Arcangeli
@ 2001-05-27 19:08     ` Ingo Molnar
  2001-05-27 20:05       ` Andrea Arcangeli
  0 siblings, 1 reply; 22+ messages in thread
From: Ingo Molnar @ 2001-05-27 19:08 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: linux-kernel, Linus Torvalds, Alan Cox, David S. Miller,
	Alexey Kuznetsov, arjanv


On Sun, 27 May 2001, Andrea Arcangeli wrote:

> an irq that could mark the softirq active under entry.S will also run
> do_softirq itself before iret to entry.S. [...]

yep.

> [...] If the softirq remains active after an irq it it because it was
> marked active again under do_softirq and ksoftirq is the way to go for
> fixing that case I think.

i took a look at your ksoftirqd stuff yesterday, and i think it's
completely unnecessary and adds serious overhead to softirq handling. The
whole point of softirqs is to have maximum scalability and no
serialization. Why did you add ksoftirqd? would you mind explaining it?

	Ingo



* Re: [patch] severe softirq handling performance bug, fix, 2.4.5
  2001-05-27 17:17 ` David S. Miller
  2001-05-27 17:56   ` Andrea Arcangeli
@ 2001-05-27 19:09   ` David S. Miller
  2001-05-27 20:24     ` Andrea Arcangeli
  1 sibling, 1 reply; 22+ messages in thread
From: David S. Miller @ 2001-05-27 19:09 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Ingo Molnar, linux-kernel, Alan Cox, Alexey Kuznetsov


Andrea Arcangeli writes:
 > 	if (softirq_active(cpu) & softirq_mask(cpu)) {
 > 	    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 > 
 > I mean everything is fine until the same softirq is marked active again
 > under do_softirq, in such case neither the do_softirq in do_IRQ will
 > run it (because we are in the critical section and we hold the per-cpu
 > locks), nor we will run it again ourself from the underlying do_softirq
 > to avoid live locking into do_softirq.

"live lock".  What do you hope to avoid by pushing softirq processing
into a scheduled task?  I think doing that is a stupid idea.

You are saying that if we are getting a lot of soft irqs we should
defer it so that we can leave the trap handler, to avoid "live lock".

I think this is a bogus scheme for several reasons.  First of all,
deferring the processing will only increase the likelihood that the
locality of the data will be lost, making the system work harder.

Secondly, if we are getting softirqs at such a rate, we have other
problems.  We are likely getting surged with hardware interrupts, and
until we have Jamal's stuff in to move ethernet hardware interrupt
handling into softirqs, your deferrals will be fruitless when they do
actually trigger.  We will be livelocked in hardware interrupt
processing instead of being livelocked in softirq processing; what an
incredible improvement. :-)

Therefore I recommend that softirqs be implemented on x86 the way at
least I intended the damn things to be implemented: on every return
from trap, no matter what kind, call do_softirq if softirqs are
pending.

Again, I am totally against ksoftirqd; I think it's a completely dumb
idea.  Softirqs were meant to be as lightweight as possible; don't
crap on them with this heavyweight live lock "solution".
It isn't even solving live locks, it's rather trading one kind for
another with zero improvement.

Later,
David S. Miller
davem@redhat.com


* Re: [patch] softirq-2.4.5-B0
  2001-05-26 23:55 ` [patch] severe softirq handling performance bug, fix, 2.4.5 David S. Miller
                     ` (2 preceding siblings ...)
  2001-05-27 19:00   ` [patch] softirq-2.4.5-B0 Ingo Molnar
@ 2001-05-27 19:15   ` David S. Miller
  2001-05-27 19:19     ` Ingo Molnar
  3 siblings, 1 reply; 22+ messages in thread
From: David S. Miller @ 2001-05-27 19:15 UTC (permalink / raw)
  To: mingo; +Cc: linux-kernel, Alan Cox, Alexey Kuznetsov


Ingo Molnar writes:
 > the bug/misbehavior causing bad latencies turned out to be the following:
 > if a hardirq triggers a softirq, but syscall-level code on the same CPU
 > disabled local bhs via local_bh_disable(), then we 'miss' the execution of
 > the softirq, until the next IRQ. (or next direct call to do_softirq()).

Hooray, some sanity in this thread finally :-)

Yes, this makes perfect sense, this is indeed what can happen.

 > the attached softirq-2.4.5-B0 patch fixes this problem by calling
 > do_softirq()  from local_bh_enable() [if the bh count is 0, to avoid
 > recursion].

Yikes!  I do not like this fix.

I'd rather local_bh_enable() not become a heavier primitive.

I know, in one respect it makes sense because it parallels how
hardware interrupts work, but now this thing is a function call
instead of a counter bump :-(

Any other ideas how to zap this?

Later,
David S. Miller
davem@redhat.com



* Re: [patch] softirq-2.4.5-B0
  2001-05-27 19:15   ` David S. Miller
@ 2001-05-27 19:19     ` Ingo Molnar
  0 siblings, 0 replies; 22+ messages in thread
From: Ingo Molnar @ 2001-05-27 19:19 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel, Alan Cox, Alexey Kuznetsov


On Sun, 27 May 2001, David S. Miller wrote:

> Hooray, some sanity in this thread finally :-)

[ finally i had some sleep after a really long debugging session :-| ]

>  > the attached softirq-2.4.5-B0 patch fixes this problem by calling
>  > do_softirq()  from local_bh_enable() [if the bh count is 0, to avoid
>  > recursion].
>
> Yikes!  I do not like this fix.

i think we have no choice, unfortunately.

and i think function calls are not that scary anymore, especially not with
regparms and similar compiler optimizations. The function is simple; it
just goes in and returns in 90% of the cases, which should be handled
nicely by most BTBs.

we have other fundamental primitives that are a function call too, eg.
dget(), and they are used just as frequently. In 2.4 we were moving
inlined code into functions in a number of cases, and it appeared to work
out well in most cases.

> I'd rather local_bh_enable() not become a more heavy primitive.
>
> I know, in one respect it makes sense because it parallels how
> hardware interrupts work, but not this thing is a function call
> instead of a counter bump :-(

i believe the important thing is that the function has no serialization or
other 'heavy' stuff. BHs had the misdesign of not being restarted after
being re-enabled, and it caused performance problems - we should not
repeat history.

	Ingo



* Re: [patch] severe softirq handling performance bug, fix, 2.4.5
  2001-05-27 19:05     ` Ingo Molnar
@ 2001-05-27 19:55       ` Andrea Arcangeli
  2001-05-27 21:00         ` Ingo Molnar
  0 siblings, 1 reply; 22+ messages in thread
From: Andrea Arcangeli @ 2001-05-27 19:55 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: David S. Miller, linux-kernel, Alan Cox, Alexey Kuznetsov

On Sun, May 27, 2001 at 09:05:50PM +0200, Ingo Molnar wrote:
> 
> On Sun, 27 May 2001, Andrea Arcangeli wrote:
> 
> > I mean everything is fine until the same softirq is marked active
> > again under do_softirq, in such case neither the do_softirq in do_IRQ
> > will run it (because we are in the critical section and we hold the
		 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > per-cpu locks), nor we will run it again ourself from the underlying
    ^^^^^^^^^^^^^
> > do_softirq to avoid live locking into do_softirq.
> 
> if you mean the stock kernel, this scenario you describe is not how it

Yes the stock kernel.

> behaves, because only IRQ contexts can mark a softirq active again. And
> those IRQ contexts will run do_IRQ() naturally, so while *this*
> do_softirq() invocation wont run those reactivated softirqs, the IRQ
> context that just triggered the softirq will do so.

it won't, because the underlying do_softirq did local_bh_disable(), and
the in_interrupt() check will cause the do_softirq from do_IRQ to return
immediately.
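
(for reference, the relevant guard at the top of 2.4's do_softirq() is
roughly the following, paraphrased from kernel/softirq.c:)

	asmlinkage void do_softirq(void)
	{
		int cpu = smp_processor_id();

		if (in_interrupt())	/* true while an irq count or a bh count
					   is raised, e.g. under do_softirq itself
					   or under local_bh_disable() */
			return;

		local_bh_disable();
		/* ... the loop quoted earlier in the thread runs here ... */
	}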

Andrea


* Re: [patch] softirq-2.4.5-A1
  2001-05-27 19:08     ` Ingo Molnar
@ 2001-05-27 20:05       ` Andrea Arcangeli
  2001-05-28  1:17         ` Horst von Brand
  0 siblings, 1 reply; 22+ messages in thread
From: Andrea Arcangeli @ 2001-05-27 20:05 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Linus Torvalds, Alan Cox, David S. Miller,
	Alexey Kuznetsov, arjanv

On Sun, May 27, 2001 at 09:08:51PM +0200, Ingo Molnar wrote:
> i took at look at your ksoftirq stuff yesterday, and i think it's
> completely unnecessery and adds serious overhead to softirq handling. The
> whole point of softirqs is to have maximum scalability and no
> serialization. Why did you add ksoftirqd, would you mind explaining it?

The only case where ksoftirqd runs is when the stock kernel does the
wrong thing and potentially delays the softirq by 1/HZ. Nothing else.

When the current kernel does the right thing, ksoftirqd cannot generate
any scalability problem; furthermore, ksoftirqd is a per-cpu thing, so if
you face a scalability problem with it, that simply means you need to fix
the scheduler, because then you would face a scalability issue just as
well every time a tux thread calls schedule().

90% of the time ksoftirqd will never run; when it runs, it means you want
to pay for a scheduler run to get it running. The price of the scheduler
is just the price for the logic that balances the softirq load in a fair
manner and without buggy latencies.

Andrea


* Re: [patch] severe softirq handling performance bug, fix, 2.4.5
  2001-05-27 19:09   ` David S. Miller
@ 2001-05-27 20:24     ` Andrea Arcangeli
  0 siblings, 0 replies; 22+ messages in thread
From: Andrea Arcangeli @ 2001-05-27 20:24 UTC (permalink / raw)
  To: David S. Miller; +Cc: Ingo Molnar, linux-kernel, Alan Cox, Alexey Kuznetsov

On Sun, May 27, 2001 at 12:09:29PM -0700, David S. Miller wrote:
> "live lock".  What do you hope to avoid by pushing softirq processing
> into a scheduled task?  I think doing that is a stupid idea.

NOTE: I'm not pushing anything out of the atomic context; I'm using
ksoftirqd only to cure the cases that the fast-path code gets wrong.

It fixes the 1/HZ latency with the idle task, and it gets right the case
of a softirq marked pending again under do_softirq.

> You are saying that if we are getting a lot of soft irqs we should
> defer it so that we can leave the trap handler, to avoid "live lock".

Yes, that is what all kernels out there are doing.

> I think this is a bogus scheme for several reasons.  First of all,
> deferring the processing will only increase the likelyhood that the
> locality of the data will be lost, making the system work harder.

Of course almost all the time the processing is not deferred.

> Secondly, if we are getting softirqs at such a rate, we have other
> problems.  We are likely getting surged with hardware interrupts, and
> until we have Jamals stuff in to move ethernet hardware interrupt
> handling into softirqs your deferrals will be fruitless when they do
> actually trigger.  We will be livelocked in hardware interrupt
> processing instead of being livelocked in softirq processing, what an
> incredible improvement. :-)

The softirq could also be marked running from another softirq; it
doesn't need to be an interrupt.

> from trap, no matter what kind, call do_softirq if softirqs are
> pending.

that already happens; I assume you mean you prefer to remove mask &=
~active from do_softirq() internally.

But doing that will hang in irq as soon as a softirq or tasklet marks
itself running again in software (and I think this happens right now in
some driver that polls the hardware from atomic context where you cannot
schedule).  Furthermore the softirq is going to take more time than the
irq core itself, so the live lock issue is not so obvious as you say, I
think.

> Again, I am totally against ksoftirqd, I think it's a completely dumb
> idea.  Softirqs were meant to be as light weight as possible, don't
> crap on them like this with this heavyweight live lock "solution".
> It isn't even solving live locks, it's rather trading one kind for
> another with zero improvement.

softirq is still as light as possible, but without ksoftirqd the logic is
wrong in some cases, and so it can get a senseless 1/HZ latency sometimes.
ksoftirqd fixes all those broken cases in a clean manner.

I'd like to know how much it helps on the gigabit benchmarks. For the
100mbit ethernet that I can test locally it is fine.

Andrea


* Re: [patch] severe softirq handling performance bug, fix, 2.4.5
  2001-05-27 19:55       ` Andrea Arcangeli
@ 2001-05-27 21:00         ` Ingo Molnar
  2001-05-28 19:26           ` kuznet
  0 siblings, 1 reply; 22+ messages in thread
From: Ingo Molnar @ 2001-05-27 21:00 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: David S. Miller, linux-kernel, Alan Cox, Alexey Kuznetsov


On Sun, 27 May 2001, Andrea Arcangeli wrote:

> Yes the stock kernel.

yep you are right.

i had this fixed too at a certain point; there is one subtle issue: under
certain circumstances tasklets re-activate the tasklet softirq(s) from
within the softirq handler, which leads to infinite loops if we just
naively restart softirq handling. This fix is not in the -B0 patch yet.
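
(a hypothetical example of such a self-reactivating tasklet, purely for
illustration; the names are made up:)

	#include <linux/interrupt.h>

	static void poll_hw(unsigned long data);	/* forward declaration */
	static DECLARE_TASKLET(poll_tasklet, poll_hw, 0);

	static void poll_hw(unsigned long data)
	{
		/* ... poll the device from atomic context ... */
		tasklet_schedule(&poll_tasklet);	/* re-activates TASKLET_SOFTIRQ
							   from within the softirq
							   handler itself */
	}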

	Ingo



* Re: [patch] softirq-2.4.5-A1
  2001-05-27 20:05       ` Andrea Arcangeli
@ 2001-05-28  1:17         ` Horst von Brand
  0 siblings, 0 replies; 22+ messages in thread
From: Horst von Brand @ 2001-05-28  1:17 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Ingo Molnar, linux-kernel, Alan Cox, David S. Miller,
	Alexey Kuznetsov, arjanv

[Deleted Linus from Cc:]
Andrea Arcangeli <andrea@suse.de> said:

[...]

> The only case ksoftirqd runs is when the stock kernel does the wrong
> thing and potentially delays the softirq of 1/HZ. Nothing else.

Then get the kernel to do the right thing, don't paper over it.
-- 
Horst von Brand                             vonbrand@sleipnir.valparaiso.cl
Casilla 9G, Vin~a del Mar, Chile                               +56 32 672616


* Re: [patch] severe softirq handling performance bug, fix, 2.4.5
  2001-05-27 21:00         ` Ingo Molnar
@ 2001-05-28 19:26           ` kuznet
  0 siblings, 0 replies; 22+ messages in thread
From: kuznet @ 2001-05-28 19:26 UTC (permalink / raw)
  To: mingo; +Cc: andrea, davem, linux-kernel, alan

Hello!

> yep you are right.
> 
> i had this fixed too at a certain point, there is one subtle issue: under
> certain circumstances tasklets re-activate the tasklet softirq(s) from
> within the softirq handler, which leads to infinite loops if we just
> naively restart softirq handling.

Exactly. That's why it is not done this way. 8)

Actually, Andrea's patch (probably improved) is the only visible
direction to solve this.

On the other hand, note one thing: the problem is as old as the pyramids.
It was always present, and it is much _lighter_ in 2.4 compared
f.e. with 2.2, because in 2.4 at least different softirqs are served
with low latency. So, if you guys consider Andrea's solution too
cumbersome, consider at least adding a check for pending softirqs in the
idle task (f.e. the recent patch by Manfred Spraul) and things will still
be quite satisfactory.

BTW Ingo, you probably missed one other point in TUX: schedule() points.
If you do not schedule for a long time, the whole engine stops working.
F.e. local_bh_disable() made in thread context is legal when and
only when some schedule() or return from syscall will follow really soon.

Alexey


PS Sorry, I am still offline, but returning to life.


* Re: [patch] severe softirq handling performance bug, fix, 2.4.5
  2001-05-26 17:59 [patch] severe softirq handling performance bug, fix, 2.4.5 Ingo Molnar
                   ` (3 preceding siblings ...)
  2001-05-27 17:17 ` David S. Miller
@ 2001-06-10 10:40 ` Rusty Russell
  4 siblings, 0 replies; 22+ messages in thread
From: Rusty Russell @ 2001-06-10 10:40 UTC (permalink / raw)
  To: mingo, linux-kernel; +Cc: Alan Cox, David S. Miller, Alexey Kuznetsov

In message <Pine.LNX.4.33.0105261920030.3336-200000@localhost.localdomain> you 
write:
> i've been seeing really bad average TCP latencies on certain gigabit cards
> (~300-400 microseconds instead of the expected 100-200 microseconds), ever
> since softnet went into the main kernel, and never found a real
> explanation for it, until today.

The S/390 guys hit similar issues with the hotplug CPU patch, with
softirqs still pending on the downed CPU.

There are two cases when this happens:

(1) softirqs disabled when the interrupt came in.

(2) softirq scheduled within softirq, such as when networking is
    overloaded (net/core/dev.c's net_rx_action()).

Solving (2) is hard: the choices are to risk livelock (by resetting
the mask inside the softirq loop), or accept that bursty traffic may
have latencies of up to one timer tick.

Either way, solving (1) by checking in local_bh_enable() is fine
(despite Dave's reservations, and Alexey's "don't do that unless
you are about to schedule()" is obviously crap).  Then we can drop the
hackish checks inside schedule(), system call returns and idle loops,
all of which were simply masking this problem.

Rusty.
--
Premature optmztion is rt of all evl. --DK


Thread overview: 22+ messages
2001-05-26 17:59 [patch] severe softirq handling performance bug, fix, 2.4.5 Ingo Molnar
2001-05-26 19:33 ` [patch] softirq-2.4.5-A1 Ingo Molnar
2001-05-27 17:12   ` Andrea Arcangeli
2001-05-27 19:08     ` Ingo Molnar
2001-05-27 20:05       ` Andrea Arcangeli
2001-05-28  1:17         ` Horst von Brand
2001-05-26 23:55 ` [patch] severe softirq handling performance bug, fix, 2.4.5 David S. Miller
2001-05-27  3:28   ` Albert D. Cahalan
2001-05-27 17:18   ` Andrea Arcangeli
2001-05-27 19:00   ` [patch] softirq-2.4.5-B0 Ingo Molnar
2001-05-27 19:15   ` David S. Miller
2001-05-27 19:19     ` Ingo Molnar
2001-05-27 17:07 ` [patch] severe softirq handling performance bug, fix, 2.4.5 Andrea Arcangeli
2001-05-27 17:17 ` David S. Miller
2001-05-27 17:56   ` Andrea Arcangeli
2001-05-27 19:05     ` Ingo Molnar
2001-05-27 19:55       ` Andrea Arcangeli
2001-05-27 21:00         ` Ingo Molnar
2001-05-28 19:26           ` kuznet
2001-05-27 19:09   ` David S. Miller
2001-05-27 20:24     ` Andrea Arcangeli
2001-06-10 10:40 ` Rusty Russell
