linux-kernel.vger.kernel.org archive mirror
* Race condition in 2.4 tasklet handling
@ 2003-08-23  2:54 TeJun Huh
  2003-08-23  4:09 ` Race condition in 2.4 tasklet handling (cli() broken?) TeJun Huh
  2003-08-23 15:04 ` Race condition in 2.4 tasklet handling Anton Blanchard
  0 siblings, 2 replies; 13+ messages in thread
From: TeJun Huh @ 2003-08-23  2:54 UTC (permalink / raw)
  To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1215 bytes --]

 It's similar to the race condition spotted in the i386 interrupt code.
The race exists between tasklet_[hi_]action() and tasklet_disable().
Again, the synchronization between tasklet_struct.count and
tasklet_struct.state relies on memory ordering.

 tasklet_disable() is fine because there's an smp_mb() at the end of
tasklet_disable_nosync(); however, in tasklet_action(), there is no
mb() between tasklet_trylock(t) and atomic_read(&t->count).  This
won't cause any trouble on architectures which order memory accesses
around atomic operations (including x86), but on architectures which
don't, a tasklet can still be executing on another cpu when
tasklet_disable() returns.
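
 Roughly, the interleaving I'm worried about looks like this (a
simplified sketch from my reading of 2.4.21, not the literal source):

/* CPU A: inside tasklet_action() / tasklet_hi_action() */
static void cpu_a(struct tasklet_struct *t)
{
	if (tasklet_trylock(t)) {	/* test_and_set_bit(TASKLET_STATE_RUN) */
		/* no barrier here: the read below may be satisfied before
		 * the STATE_RUN store becomes visible to other cpus */
		if (!atomic_read(&t->count))
			t->func(t->data);	/* tasklet body runs... */
		tasklet_unlock(t);
	}
}

/* CPU B: disabling the tasklet */
static void cpu_b(struct tasklet_struct *t)
{
	tasklet_disable(t);	/* atomic_inc(&t->count); smp_mb();
				 * tasklet_unlock_wait() can still observe
				 * TASKLET_STATE_RUN clear and return... */
	/* ...while t->func() is executing on CPU A. */
}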

 Adding smp_mb__after_test_and_set_bit() at the end of
tasklet_trylock() should remedy the situation.  As
smp_mb__{before|after}_test_and_set_bit() don't exist yet, I'm
attaching a patch which adds smp_mb__after_clear_bit().  The patch is
against 2.4.21.

P.S. Please comment on the addition of
smp_mb__{before|after}_test_and_set_bit().

P.P.S. One thing I don't really understand is the uses of smp_mb() at
the end of tasklet_disable() and smp_mb__before_atomic_dec() inside
tasklet_enable().  Can anybody tell me what those are for?

--
tejun

[-- Attachment #2: patch.taskletrace --]
[-- Type: text/plain, Size: 1020 bytes --]

# This is a BitKeeper generated patch for the following project:
# Project Name: linux
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas:
#	           ChangeSet	1.3     -> 1.4    
#	include/linux/interrupt.h	1.1     -> 1.2    
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 03/08/23	tj@atj.dyndns.org	1.4
# - tasklet race fix.
# --------------------------------------------
#
diff -Nru a/include/linux/interrupt.h b/include/linux/interrupt.h
--- a/include/linux/interrupt.h	Sat Aug 23 11:52:03 2003
+++ b/include/linux/interrupt.h	Sat Aug 23 11:52:03 2003
@@ -134,7 +134,10 @@
 #ifdef CONFIG_SMP
 static inline int tasklet_trylock(struct tasklet_struct *t)
 {
-	return !test_and_set_bit(TASKLET_STATE_RUN, &(t)->state);
+	int ret;
+	ret = !test_and_set_bit(TASKLET_STATE_RUN, &(t)->state);
+	smp_mb__after_clear_bit();
+	return ret;
 }
 
 static inline void tasklet_unlock(struct tasklet_struct *t)


* Re: Race condition in 2.4 tasklet handling (cli() broken?)
  2003-08-23  2:54 Race condition in 2.4 tasklet handling TeJun Huh
@ 2003-08-23  4:09 ` TeJun Huh
  2003-08-23  5:26   ` TeJun Huh
  2003-08-23 15:04 ` Race condition in 2.4 tasklet handling Anton Blanchard
  1 sibling, 1 reply; 13+ messages in thread
From: TeJun Huh @ 2003-08-23  4:09 UTC (permalink / raw)
  To: linux-kernel

 Additional suspicious things.

1. tasklet_kill() has a similar race condition.  An mb() is required
before tasklet_unlock_wait().

2. The local_bh_count() and global_bh_lock tests inside wait_on_irq()
suggest that cli() tries to block not only interrupt handling but also
all softirq handling on all cpus; however, the current implementation
does not guarantee that.

 Because local_bh_count is adjusted in do_softirq() _after_
local_irq_count() has been decremented, other cpus may happily begin
softirq/tasklet/bh handling while a cpu is inside a cli() - sti()
critical section.

 If softirq handling is not guaranteed to be blocked during a cli() -
sti() critical section, the local_bh_count() and global_bh_lock tests
inside wait_on_irq() are redundant, and if it should be guaranteed,
the current implementation seems broken.

-- 
tejun


* Re: Race condition in 2.4 tasklet handling (cli() broken?)
  2003-08-23  4:09 ` Race condition in 2.4 tasklet handling (cli() broken?) TeJun Huh
@ 2003-08-23  5:26   ` TeJun Huh
  2003-08-23 10:28     ` Stephan von Krawczynski
  0 siblings, 1 reply; 13+ messages in thread
From: TeJun Huh @ 2003-08-23  5:26 UTC (permalink / raw)
  To: linux-kernel

 Oops, sorry.  Only bh handling is relevant.  Softirqs and tasklets
are not a concern for cli().

On Sat, Aug 23, 2003 at 01:09:31PM +0900, TeJun Huh wrote:
>  Additional suspicious things.
> 
> 1. tasklet_kill() has a similar race condition.  An mb() is required
> before tasklet_unlock_wait().

Corrected 2.

 The global_bh_lock test inside wait_on_irq() suggests that cli()
tries to block not only interrupt handling but also bh handling on all
cpus; however, the current implementation does not guarantee that.

 Because global_bh_lock is acquired in bh_action() <call trace:
handle_IRQ_event()->do_softirq()->tasklet_action()->bh_action()> after
local_irq_count() has been decremented, other cpus may happily begin
bh handling while a cpu is still inside a cli() - sti() critical
section.

 If bh handling is not guaranteed to be blocked during a cli() - sti()
critical section, the global_bh_lock test inside wait_on_irq() is
redundant, and if it should be guaranteed, the current implementation
seems broken.
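
 The window I'm worried about looks roughly like this (a pseudo-timeline
built from the functions named above, not literal source):

/*
 *   CPU A (leaving an interrupt)        CPU B (entering cli())
 *   ----------------------------        ----------------------
 *   --local_irq_count(cpu);
 *                                       wait_on_irq():
 *                                         sees irqs_running() == 0
 *                                         sees global_bh_lock unlocked
 *                                         -> returns; the cli() - sti()
 *                                            critical section proceeds
 *   do_softirq()
 *     -> tasklet_action()
 *       -> bh_action()
 *            takes global_bh_lock
 *            bh handler runs concurrently with CPU B's
 *            cli() - sti() critical section
 */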

-- 
tejun


* Re: Race condition in 2.4 tasklet handling (cli() broken?)
  2003-08-23  5:26   ` TeJun Huh
@ 2003-08-23 10:28     ` Stephan von Krawczynski
  2003-08-23 15:13       ` TeJun Huh
  0 siblings, 1 reply; 13+ messages in thread
From: Stephan von Krawczynski @ 2003-08-23 10:28 UTC (permalink / raw)
  To: TeJun Huh; +Cc: linux-kernel

On Sat, 23 Aug 2003 14:26:34 +0900
TeJun Huh <tejun@aratech.co.kr> wrote:

>  Oops, sorry.  Only bh handling is relevant.  Softirqs and tasklets
> are not a concern for cli().
> 
> On Sat, Aug 23, 2003 at 01:09:31PM +0900, TeJun Huh wrote:
> >  Additional suspicious things.
> > 
> > 1. tasklet_kill() has a similar race condition.  An mb() is required
> > before tasklet_unlock_wait().
> 
> Corrected 2.
> [...]
>  If bh handling is not guaranteed to be blocked during a cli() - sti()
> critical section, the global_bh_lock test inside wait_on_irq() is
> redundant, and if it should be guaranteed, the current implementation
> seems broken.

If we follow your analysis and say it is broken, do you have a suggestion/patch
how to fix both? I am willing to try your proposals, as it seems I am one of
very few who really experience stability issues on SMP with the current
implementation.

Regards,
Stephan


* Re: Race condition in 2.4 tasklet handling
  2003-08-23  2:54 Race condition in 2.4 tasklet handling TeJun Huh
  2003-08-23  4:09 ` Race condition in 2.4 tasklet handling (cli() broken?) TeJun Huh
@ 2003-08-23 15:04 ` Anton Blanchard
  1 sibling, 0 replies; 13+ messages in thread
From: Anton Blanchard @ 2003-08-23 15:04 UTC (permalink / raw)
  To: linux-kernel


Hi,

>  Adding smp_mb__after_test_and_set_bit() at the end of
> tasklet_trylock() should remedy the situation.  As
> smp_mb__{before|after}_test_and_set_bit() don't exist yet, I'm
> attaching a patch which adds smp_mb__after_clear_bit().  The patch is
> against 2.4.21.

No, the atomic and bitop operations that return values (like
test_and_set_bit) must have barriers. Is x86 missing these?

> P.S. Please comment on the addition of
> smp_mb__{before|after}_test_and_set_bit().

Only the atomic and bitop operations that don't return values (set_bit,
atomic_inc, etc.) might require ordering, depending on use.
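
For example (a usage sketch, not code from any particular driver):

static unsigned long flags_word;

static void bitop_barrier_example(void)
{
	/* value-returning bitops imply a full memory barrier by themselves */
	if (test_and_set_bit(0, &flags_word))
		return;

	/* plain bitops do not; pair them with an explicit barrier where
	 * ordering against other accesses matters */
	clear_bit(0, &flags_word);
	smp_mb__after_clear_bit();
}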

Anton


* Re: Race condition in 2.4 tasklet handling (cli() broken?)
  2003-08-23 10:28     ` Stephan von Krawczynski
@ 2003-08-23 15:13       ` TeJun Huh
  2003-08-23 15:37         ` Stephan von Krawczynski
  2003-08-23 15:56         ` Stephan von Krawczynski
  0 siblings, 2 replies; 13+ messages in thread
From: TeJun Huh @ 2003-08-23 15:13 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: linux-kernel

On Sat, Aug 23, 2003 at 12:28:13PM +0200, Stephan von Krawczynski wrote:
> 
> If we follow your analysis and say it is broken, do you have a suggestion/patch
> how to fix both? I am willing to try your proposals, as it seems I am one of
> very few who really experience stability issues on SMP with the current
> implementation.
> 

 Hello, Stephan.

 The race conditions I'm mentioning in this thread are not likely to
cause real trouble.  The first one does not make any difference on
x86, and AFAIK bh isn't used extensively anymore, so the second one
isn't very relevant either.  Only the race condition mentioned in the
other thread is of relevance, if there is any :-(.

 We've also been suffering from random lockups (2.4.21 with various
patches including in-kernel irqbalancing) which show symptoms somewhat
different from the usual kernel deadlocks or panics.  We've seen
lockups on several different machines.  All the machines were SMP and
quite busy with high-volume network traffic and a lot of disk I/O.  A
lockup takes from a week to a month(!) to occur.  Even though they
take very long to develop, they occur fairly reliably.

 I had a chance to examine a locked-up machine (dual 3G Xeon w/HT).
I could turn the keyboard LEDs on and off (so the keyboard irq was
working), but the console didn't come back from its blanked state.
The kernel was compiled with sysrq and I tried many sysrq commands,
but the console remained blank.  After a while, I pressed the sysrq
reboot key and it rebooted.  Fortunately, the kernel log file
contained all the output from the sysrqs, and I could do a bit of
post-mortem analysis.

 The first weird thing was the timestamps.  Time seemed to have
stopped for a few hours between the lockup and the first sysrq
request.  Then time started to advance again after the first sysrq
request.  (The NMI watchdog was on.)

 The process list showed that a server process was stuck inside the
kernel, but the place where it was stuck was very weird.  It was
freeing a socket after receiving a FIN.  The eip stayed at the same
place over several sysrqs, and the instruction at the eip was a plain
ADD right after a CALL to kfree.  I think there is one more frame
above what's shown, but I don't know how sysrq prints stack traces for
other cpus, so I'm not sure.

 To gather more information, we hooked up a machine with kdb and are
waiting for the lockup to occur again.  My personal feeling is that
the race conditions I've mentioned are not the causes of the lockups
we're suffering from.

 It would be helpful if you can tell us more about your lockups.  Have
you tried sysrq, NMI watchdog, kdb or kgdb?

-- 
tejun


* Re: Race condition in 2.4 tasklet handling (cli() broken?)
  2003-08-23 15:13       ` TeJun Huh
@ 2003-08-23 15:37         ` Stephan von Krawczynski
  2003-08-25  6:31           ` TeJun Huh
  2003-08-23 15:56         ` Stephan von Krawczynski
  1 sibling, 1 reply; 13+ messages in thread
From: Stephan von Krawczynski @ 2003-08-23 15:37 UTC (permalink / raw)
  To: TeJun Huh; +Cc: linux-kernel

On Sun, 24 Aug 2003 00:13:15 +0900
TeJun Huh <tejun@aratech.co.kr> wrote:

> On Sat, Aug 23, 2003 at 12:28:13PM +0200, Stephan von Krawczynski wrote:
> > 
> > If we follow your analysis and say it is broken, do you have a
> > suggestion/patch how to fix both? I am willing to try your proposals, as it
> > seems I am one of very few who really experience stability issues on SMP
> > with the current implementation.
> > 
> 
>  Hello, Stephan.
> [...]
>  It would be helpful if you can tell us more about your lockups.  Have
> you tried sysrq, NMI watchdog, kdb or kgdb?

Hello TeJun,

my first offer for a real bloat of information is the LKML thread with the
subject "2.4.22-pre lockups" (warning: the thread is _long_). I have given
oopses and quite a lot of detail about what is happening. But up to this day
no one seems to have reproduced anything. For me it looks pretty "simple" to
produce lockups. All installations are SuSE 8.2, but with a standard kernel.
My primary hint for increasing the lockup probability is using reiserfs.
Though it does not seem directly related, because I can produce lockups with
only ext3, too, there just seems to be more action in areas that are simply
hit by the problem more often.

Up to now I have not been able to crash 2.4.20, but 2.4.21 and anything later
definitely dies.
To produce the problem I simply copy around GBs of data (around 100 GB) between
several disks and over NFS to and from some clients (with the test system as
the nfs server). Under normal circumstances I need around 2 days to crash the
box. I set up a serial console to catch the oopses. They are all at different
locations and mostly related to some weird list corruption; here is an example
(taken from 2.4.22-pre10):

Unable to handle kernel NULL pointer dereference at virtual address 00000006
c0144b14
*pde = 00000000
Oops: 0002
CPU:    1
EIP:    0010:[<c0144b14>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010246
eax: 00000000   ebx: f0f66540   ecx: f0f66540   edx: 00000006
esi: f0f66540   edi: f0f66540   ebp: c2ce0350   esp: c345df24
ds: 0018   es: 0018   ss: 0018
Process kswapd (pid: 5, stackpage=c345d000)
Stack: c0147ddf f0f66540 00000000 c2ce0350 0001bcad c02eab68 c0139228 c2ce0350
       000001d0 00000200 000001d0 00000016 00000020 000001d0 00000020 00000006
       c01394b3 00000006 c345c000 c02eab68 000001d0 00000006 c02eab68 00000000
Call Trace:    [<c0147ddf>] [<c0139228>] [<c01394b3>] [<c013952e>] [<c013963c>]
  [<c01396c8>] [<c01397f8>] [<c0139760>] [<c0105000>] [<c010592e>] [<c0139760>]
Code: 89 02 c7 41 30 00 00 00 00 89 4c 24 04 e9 7a ff ff ff 8d 76


>>EIP; c0144b14 <__remove_from_queues+14/30>   <=====

>>ebx; f0f66540 <_end+30bbb320/3852ee40>
>>ecx; f0f66540 <_end+30bbb320/3852ee40>
>>esi; f0f66540 <_end+30bbb320/3852ee40>
>>edi; f0f66540 <_end+30bbb320/3852ee40>
>>ebp; c2ce0350 <_end+2935130/3852ee40>
>>esp; c345df24 <_end+30b2d04/3852ee40>

Trace; c0147ddf <try_to_free_buffers+7f/170>
Trace; c0139228 <shrink_cache+298/3b0>
Trace; c01394b3 <shrink_caches+63/a0>
Trace; c013952e <try_to_free_pages_zone+3e/60>
Trace; c013963c <kswapd_balance_pgdat+4c/b0>
Trace; c01396c8 <kswapd_balance+28/40>
Trace; c01397f8 <kswapd+98/c0>
Trace; c0139760 <kswapd+0/c0>
Trace; c0105000 <_stext+0/0>
Trace; c010592e <arch_kernel_thread+2e/40>
Trace; c0139760 <kswapd+0/c0>

Code;  c0144b14 <__remove_from_queues+14/30>
00000000 <_EIP>:
Code;  c0144b14 <__remove_from_queues+14/30>   <=====
   0:   89 02                     mov    %eax,(%edx)   <=====
Code;  c0144b16 <__remove_from_queues+16/30>
   2:   c7 41 30 00 00 00 00      movl   $0x0,0x30(%ecx)
Code;  c0144b1d <__remove_from_queues+1d/30>
   9:   89 4c 24 04               mov    %ecx,0x4(%esp,1)
Code;  c0144b21 <__remove_from_queues+21/30>
   d:   e9 7a ff ff ff            jmp    ffffff8c <_EIP+0xffffff8c>
Code;  c0144b26 <__remove_from_queues+26/30>
  12:   8d 76 00                  lea    0x0(%esi),%esi

Somehow related seems a problem with data integrity. If I back up large files
to tape (SDLT) and verify them afterwards (all done with a simple tar command),
I get verification errors most of the time. In about 100 GB of backed-up data
there are around 1-4 of those. This test "works" pretty reliably, so in case
you have some equipment around you may want to try it.
Ah yes: of course all of this only happens on SMP. The same kernel versions
work flawlessly on the same box using UP. This is why I entered your thread
about races in SMP...

Feel free to ask anything about the issue, I will try to help however possible.

Regards,
Stephan





* Re: Race condition in 2.4 tasklet handling (cli() broken?)
  2003-08-23 15:13       ` TeJun Huh
  2003-08-23 15:37         ` Stephan von Krawczynski
@ 2003-08-23 15:56         ` Stephan von Krawczynski
  2003-08-23 16:36           ` TeJun Huh
  2003-08-24  3:27           ` TeJun Huh
  1 sibling, 2 replies; 13+ messages in thread
From: Stephan von Krawczynski @ 2003-08-23 15:56 UTC (permalink / raw)
  To: TeJun Huh; +Cc: linux-kernel

On Sun, 24 Aug 2003 00:13:15 +0900
TeJun Huh <tejun@aratech.co.kr> wrote:

>  Hello, Stephan.
> 
>  The race conditions I'm mentioning in this thread are not likely to
> cause real trouble.  The first one does not make any difference on
> x86, and AFAIK bh isn't used extensively anymore, so the second one
> isn't very relevant either.  Only the race condition mentioned in the
> other thread is of relevance, if there is any :-(.

Are you sure? bh is used in fs subtree to my knowledge ...

Regards,
Stephan



* Re: Race condition in 2.4 tasklet handling (cli() broken?)
  2003-08-23 15:56         ` Stephan von Krawczynski
@ 2003-08-23 16:36           ` TeJun Huh
  2003-08-24  3:27           ` TeJun Huh
  1 sibling, 0 replies; 13+ messages in thread
From: TeJun Huh @ 2003-08-23 16:36 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: TeJun Huh, linux-kernel

On Sat, Aug 23, 2003 at 05:56:04PM +0200, Stephan von Krawczynski wrote:
> On Sun, 24 Aug 2003 00:13:15 +0900
> TeJun Huh <tejun@aratech.co.kr> wrote:
> 
> >  Hello, Stephan.
> > 
> >  The race conditions I'm mentioning in this thread are not likely to
> > cause real trouble.  The first one does not make any difference on
> > x86, and AFAIK bh isn't used extensively anymore, so the second one
> > isn't very relevant either.  Only the race condition mentioned in the
> > other thread is of relevance, if there is any :-(.
> 
> Are you sure? bh is used in fs subtree to my knowledge ...
> 

 Wow, I'm not sure.  Because our application is mostly concerned with
the network and raw disk I/O, I haven't been paying attention to the
fs code.  If bh can be problematic, I'll rethink the bh
synchronization and make a patch tomorrow, probably another one- or
two-liner.  First thing tomorrow.

-- 
tejun


* Re: Race condition in 2.4 tasklet handling (cli() broken?)
  2003-08-23 15:56         ` Stephan von Krawczynski
  2003-08-23 16:36           ` TeJun Huh
@ 2003-08-24  3:27           ` TeJun Huh
  1 sibling, 0 replies; 13+ messages in thread
From: TeJun Huh @ 2003-08-24  3:27 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: linux-kernel, manfred, andrea

[-- Attachment #1: Type: text/plain, Size: 588 bytes --]

Hello, Stephan.

 I'm attaching a patch which makes the following changes.

1. adds smp_mb() to irq_enter().
2. adds smp_mb() to synchronize_irq().
3. makes get_irq_lock() grab global_bh_lock before returning.
4. makes release_irq_lock() release global_bh_lock.

 As I've now found out that test_and_set_bit() implies a memory
barrier, the smp_mb__after_test_and_set_bit() stuff has been dropped.

 This patch should fix the two relevant race conditions mentioned in
this thread and the other one.  Please test it.  It's against the
latest 2.4 bk tree, but applying it to 2.4.21 should be ok.

-- 
tejun

[-- Attachment #2: patch.irqbh --]
[-- Type: text/plain, Size: 2199 bytes --]

# This is a BitKeeper generated patch for the following project:
# Project Name: Linux kernel tree
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas:
#	           ChangeSet	1.1102  -> 1.1103 
#	arch/i386/kernel/irq.c	1.7     -> 1.8    
#	include/asm-i386/hardirq.h	1.4     -> 1.5    
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 03/08/24	tj@atj.dyndns.org	1.1103
# - irq/bh race fixes.
# --------------------------------------------
#
diff -Nru a/arch/i386/kernel/irq.c b/arch/i386/kernel/irq.c
--- a/arch/i386/kernel/irq.c	Sun Aug 24 12:18:01 2003
+++ b/arch/i386/kernel/irq.c	Sun Aug 24 12:18:01 2003
@@ -272,8 +272,7 @@
 		 * already executing in one..
 		 */
 		if (!irqs_running())
-			if (local_bh_count(cpu) || !spin_is_locked(&global_bh_lock))
-				break;
+			break;
 
 		/* Duh, we have to loop. Release the lock to avoid deadlocks */
 		clear_bit(0,&global_irq_lock);
@@ -307,6 +306,7 @@
  */
 void synchronize_irq(void)
 {
+	smp_mb(); /* Sync with irq_enter() */
 	if (irqs_running()) {
 		/* Stupid approach */
 		cli();
@@ -332,6 +332,10 @@
 	 * in an interrupt context. 
 	 */
 	wait_on_irq(cpu);
+
+	/* bh is disallowed inside irqlock. */
+	if (!local_bh_count(cpu))
+		spin_lock(&global_bh_lock);
 
 	/*
 	 * Ok, finally..
diff -Nru a/include/asm-i386/hardirq.h b/include/asm-i386/hardirq.h
--- a/include/asm-i386/hardirq.h	Sun Aug 24 12:18:01 2003
+++ b/include/asm-i386/hardirq.h	Sun Aug 24 12:18:01 2003
@@ -54,10 +54,15 @@
 	return 0;
 }
 
+extern spinlock_t global_bh_lock; /* copied from linux/interrupt.h to break
+				     include loop :-( */
+
 static inline void release_irqlock(int cpu)
 {
 	/* if we didn't own the irq lock, just ignore.. */
 	if (global_irq_holder == (unsigned char) cpu) {
+		if (!local_bh_count(cpu))
+			spin_unlock(&global_bh_lock);
 		global_irq_holder = NO_PROC_ID;
 		clear_bit(0,&global_irq_lock);
 	}
@@ -66,6 +71,8 @@
 static inline void irq_enter(int cpu, int irq)
 {
 	++local_irq_count(cpu);
+
+	smp_mb(); /* sync with wait_on_irq() and synchronize_irq() */
 
 	while (test_bit(0,&global_irq_lock)) {
 		cpu_relax();


* Re: Race condition in 2.4 tasklet handling (cli() broken?)
  2003-08-23 15:37         ` Stephan von Krawczynski
@ 2003-08-25  6:31           ` TeJun Huh
  2003-08-25  7:23             ` Stephan von Krawczynski
  0 siblings, 1 reply; 13+ messages in thread
From: TeJun Huh @ 2003-08-25  6:31 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: linux-kernel

On Sat, Aug 23, 2003 at 05:37:36PM +0200, Stephan von Krawczynski wrote:
> 
> Feel free to ask anything about the issue, I will try to help however possible.

 A deadlock occurred today on our test machine, and the offending
part was drivers/char/random.c:iplock.  The spinlock is used without
the _bh suffix by both the connect and accept paths, occasionally
causing a deadlock.  This seems to be already fixed in the latest 2.4
bk tree.  I believe this will cure our problem. :-)
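
 For reference, the pattern is the classic one (a generic sketch, not
the actual random.c code):

static spinlock_t example_lock = SPIN_LOCK_UNLOCKED;

static void broken_process_context_path(void)
{
	spin_lock(&example_lock);	/* BUG: bottom halves still enabled */
	/* if a bh runs on this cpu right here and also takes example_lock,
	 * it spins forever on a lock this cpu already holds -> deadlock */
	spin_unlock(&example_lock);
}

static void fixed_process_context_path(void)
{
	spin_lock_bh(&example_lock);	/* block bottom halves locally first */
	spin_unlock_bh(&example_lock);
}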

-- 
tejun


* Re: Race condition in 2.4 tasklet handling (cli() broken?)
  2003-08-25  6:31           ` TeJun Huh
@ 2003-08-25  7:23             ` Stephan von Krawczynski
  2003-08-26  0:27               ` TeJun Huh
  0 siblings, 1 reply; 13+ messages in thread
From: Stephan von Krawczynski @ 2003-08-25  7:23 UTC (permalink / raw)
  To: TeJun Huh; +Cc: linux-kernel

On Mon, 25 Aug 2003 15:31:55 +0900
TeJun Huh <tejun@aratech.co.kr> wrote:

> On Sat, Aug 23, 2003 at 05:37:36PM +0200, Stephan von Krawczynski wrote:
> > 
> > Feel free to ask anything about the issue, I will try to help however possible.
> 
>  A deadlock occurred today on our test machine, and the offending
> part was drivers/char/random.c:iplock.  The spinlock is used without
> the _bh suffix by both the connect and accept paths, occasionally
> causing a deadlock.  This seems to be already fixed in the latest 2.4
> bk tree.  I believe this will cure our problem. :-)
> 
> -- 
> tejun

Hello TeJun,

I started testing your latest patch yesterday on top of rc3 and got the following:

ksymoops 2.4.8 on i686 2.4.22-rc3-huh.  Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.22-rc3-huh/ (default)
     -m /boot/System.map-2.4.22-rc3-huh (default)

Warning: You did not tell me where to find symbol information.  I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc.  ksymoops -h explains the options.

Kernel panic: Ththththaats all folks.  Too dangerous to continue.
NMI Watchdog detected LOCKUP on CPU0, eip c01db130, registers:
CPU:    0
EIP:    0010:[<c01db130>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00000082
eax: 00000400   ebx: c3461640   ecx: 00000803   edx: f79bc000
esi: c1000020   edi: cccb0f38   ebp: 00000001   esp: f46fdbec
ds: 0018   es: 0018   ss: 0018
Process tar (pid: 9634, stackpage=f46fd000)
Stack: 0000007d 0009b220 9410d001 f0000000 00000001 00000003 c3461640 c0115f54
       00000400 00000200 00000000 00000000 c3461648 00000000 00000008 002a2480
       00000000 00000803 cccb0f38 00000008 00000001 c01da99e c3461618 00000001
Call Trace:    [<c0115f54>] [<c01da99e>] [<c01daa7e>] [<c0144a9c>] [<c0144bb0>]
  [<c01cbe24>] [<c01ce9b8>] [<c01d6590>] [<c011cc9e>] [<c0144be5>] [<c0144d2a>]
  [<c0144ee1>] [<c0144f7f>] [<c011c788>] [<c02174c1>] [<c021779d>] [<c01da4b0>]
  [<c01da99e>] [<c01daa7e>] [<c0146ce7>] [<c013aee2>] [<c0130dce>] [<c0195aa0>]
  [<c01314bf>] [<c01317a1>] [<c0131da0>] [<c013204c>] [<c0131da0>] [<c01432db>]
  [<c010782f>]
Code: f3 90 7e f5 e9 db f0 ff ff 80 3d b8 fb 30 c0 00 f3 90 7e f5


>>EIP; c01db130 <.text.lock.ll_rw_blk+57/77>   <=====

>>ebx; c3461640 <_end+3099fe0/38512a00>
>>edx; f79bc000 <_end+375f49a0/38512a00>
>>esi; c1000020 <_end+c389c0/38512a00>
>>edi; cccb0f38 <_end+c8e98d8/38512a00>
>>esp; f46fdbec <_end+3433658c/38512a00>

Trace; c0115f54 <smp_apic_timer_interrupt+134/160>
Trace; c01da99e <generic_make_request+be/140>
Trace; c01daa7e <submit_bh+5e/c0>
Trace; c0144a9c <write_locked_buffers+2c/40>
Trace; c0144bb0 <write_some_buffers+100/110>
Trace; c01cbe24 <lf+74/80>
Trace; c01ce9b8 <vt_console_print+248/370>
Trace; c01d6590 <serial_console_write+120/220>
Trace; c011cc9e <__call_console_drivers+4e/70>
Trace; c0144be5 <write_unlocked_buffers+25/30>
Trace; c0144d2a <sync_buffers+1a/80>
Trace; c0144ee1 <fsync_dev+21/a0>
Trace; c0144f7f <sys_sync+f/20>
Trace; c011c788 <panic+128/140>
Trace; c02174c1 <dump_stats+61/b0>
Trace; c021779d <scsi_back_merge_fn_c+7d/c0>
Trace; c01da4b0 <__make_request+380/7b0>
Trace; c01da99e <generic_make_request+be/140>
Trace; c01daa7e <submit_bh+5e/c0>
Trace; c0146ce7 <block_read_full_page+2d7/2f0>
Trace; c013aee2 <__alloc_pages+42/190>
Trace; c0130dce <page_cache_read+be/e0>
Trace; c0195aa0 <reiserfs_get_block+0/1490>
Trace; c01314bf <generic_file_readahead+af/1a0>
Trace; c01317a1 <do_generic_file_read+1c1/470>
Trace; c0131da0 <file_read_actor+0/110>
Trace; c013204c <generic_file_read+19c/1b0>
Trace; c0131da0 <file_read_actor+0/110>
Trace; c01432db <sys_read+9b/180>
Trace; c010782f <system_call+33/38>

Code;  c01db130 <.text.lock.ll_rw_blk+57/77>
00000000 <_EIP>:
Code;  c01db130 <.text.lock.ll_rw_blk+57/77>   <=====
   0:   f3 90                     repz nop    <=====
Code;  c01db132 <.text.lock.ll_rw_blk+59/77>
   2:   7e f5                     jle    fffffff9 <_EIP+0xfffffff9>
Code;  c01db134 <.text.lock.ll_rw_blk+5b/77>
   4:   e9 db f0 ff ff            jmp    fffff0e4 <_EIP+0xfffff0e4>
Code;  c01db139 <.text.lock.ll_rw_blk+60/77>
   9:   80 3d b8 fb 30 c0 00      cmpb   $0x0,0xc030fbb8
Code;  c01db140 <.text.lock.ll_rw_blk+67/77>
  10:   f3 90                     repz nop 
Code;  c01db142 <.text.lock.ll_rw_blk+69/77>
  12:   7e f5                     jle    9 <_EIP+0x9>


1 warning issued.  Results may not be reliable.

Can you comment on this one?

Regards,
Stephan



* Re: Race condition in 2.4 tasklet handling (cli() broken?)
  2003-08-25  7:23             ` Stephan von Krawczynski
@ 2003-08-26  0:27               ` TeJun Huh
  0 siblings, 0 replies; 13+ messages in thread
From: TeJun Huh @ 2003-08-26  0:27 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: linux-kernel

 Hello Stephan,

 I'm sorry, but I don't have much of an idea.  That looks like a
deadlock on io_request_lock, which I think is very unlikely, or maybe
another cpu took io_request_lock and hung.  I've also found out that
the bh race I thought I had found doesn't exist, due to the order of
the spin_trylock() and hardirq_trylock() tests in bh_action().  So
only the first, irq_enter()-related race is valid.  I've been trying
to reproduce your kernel hang with reiserfs and nfs since yesterday,
without success.
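
 For reference, bh_action() does roughly the following (a simplified
sketch from memory, not the literal 2.4 source): global_bh_lock is
taken before the hardirq check, so either wait_on_irq() sees the bh
lock held, or bh_action() sees the irq lock held and backs off.

static void bh_action_sketch(unsigned long nr)
{
	int cpu = smp_processor_id();

	if (!spin_trylock(&global_bh_lock))	/* taken before the irq check */
		goto resched;

	if (!hardirq_trylock(cpu))	/* backs off if someone is in cli() */
		goto resched_unlock;

	if (bh_base[nr])
		bh_base[nr]();

	hardirq_endlock(cpu);
	spin_unlock(&global_bh_lock);
	return;

resched_unlock:
	spin_unlock(&global_bh_lock);
resched:
	mark_bh(nr);
}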

-- 
tejun



Thread overview: 13+ messages
2003-08-23  2:54 Race condition in 2.4 tasklet handling TeJun Huh
2003-08-23  4:09 ` Race condition in 2.4 tasklet handling (cli() broken?) TeJun Huh
2003-08-23  5:26   ` TeJun Huh
2003-08-23 10:28     ` Stephan von Krawczynski
2003-08-23 15:13       ` TeJun Huh
2003-08-23 15:37         ` Stephan von Krawczynski
2003-08-25  6:31           ` TeJun Huh
2003-08-25  7:23             ` Stephan von Krawczynski
2003-08-26  0:27               ` TeJun Huh
2003-08-23 15:56         ` Stephan von Krawczynski
2003-08-23 16:36           ` TeJun Huh
2003-08-24  3:27           ` TeJun Huh
2003-08-23 15:04 ` Race condition in 2.4 tasklet handling Anton Blanchard
