linux-kernel.vger.kernel.org archive mirror
* buffer_head slab memory leak, Linux bug?
@ 2001-09-02 11:01 Elisheva Alexander
  2001-09-02 11:09 ` SMP: stuck on TLB IPI wait (was Re: buffer_head slab memory leak, Linux bug?) Elisheva Alexander
  2001-09-02 12:46 ` buffer_head slab memory leak, Linux bug? Alan Cox
  0 siblings, 2 replies; 3+ messages in thread
From: Elisheva Alexander @ 2001-09-02 11:01 UTC (permalink / raw)
  To: linux-kernel

Dear kernel list,

if anyone can send me some pointers or hints on how to tackle this bug i 
will be very happy.

on an SMP machine i get:
"stuck on TLB IPI wait (CPU#1)"
the driver that i am debugging uses a spin lock, and sometimes we take the
lock for a pretty long time.
this happens during heavy load, which is why i think that the problem
is that in smp_flush_tlb() in ./arch/i386/kernel/smp.c, one of the CPUs gets 
all upset that the other CPU is stuck in the lock for too long, and releases 
it before it was meant to be released.
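
To make the hypothesis above concrete, here is a minimal userspace sketch of a bounded wait: one side spins until the other side acknowledges a request, and gives up with a diagnostic once a spin budget runs out. This is not the code in smp_flush_tlb(); the function name, the mask parameter, and the spin budget are all hypothetical, and C11 atomics stand in for the kernel's own primitives. It only illustrates why a CPU that cannot answer in time (for example, because it is busy inside a long critical section) can produce a "stuck" message on the waiting CPU.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical illustration only: spin until every bit in *pending_mask
 * has been cleared by the other side, or until max_spins iterations have
 * passed. Returns 0 if the acknowledgment arrived, -1 if we gave up.
 */
int sketch_wait_for_ack(atomic_uint *pending_mask, unsigned long max_spins)
{
    unsigned long spins = 0;

    assert(pending_mask != NULL);

    while (atomic_load_explicit(pending_mask, memory_order_acquire) != 0) {
        if (++spins >= max_spins) {
            /* The other side never answered within the budget. */
            fprintf(stderr, "sketch: stuck waiting for ack (mask=%#x)\n",
                    atomic_load(pending_mask));
            return -1;
        }
    }
    return 0;
}
```

With a mask that is already zero the function returns 0 immediately; with a mask that nobody ever clears it returns -1 after max_spins iterations.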

things i did that didn't help:

a patch that fixed a similar problem in reiserfs
(http://www.geocrawler.com/mail/msg.php3?msg_id=3962182&list=3455)
the patch for the fast pentium problem, since i have a pentium III.
(http://www.ultraviolet.org/mail-archives/reiserfs.2000/6201.html)

i put a breakpoint when this occurs using kGDB, but i am not able to get 
the registers (and stack) of the CPU that is stuck, only the one that 
prints the message. so i don't really know where this occurs
in our own code. 
does anyone know how i may extract the stack of the second CPU at the 
time of this error?

I am using a dual-CPU Intel Pentium III machine.
I am debugging Check Point's firewall and VPN modules, with kernel-2.2.14
from the Red Hat RPM, but this also happens with the latest 2.2.19.

it happens quite often (at random), so it's not too hard to recreate it.

thanks a lot.

(please CC me, as i am not subscribed to the list.)

-- 
 Elisheva Alexander                          Software Developer
================================================================
 Email from people at checkpoint.com does not usually represent 
 official policy of Check Point (TM) Software Technologies Ltd.


* SMP: stuck on TLB IPI wait (was Re: buffer_head slab memory leak, Linux bug?)
  2001-09-02 11:01 buffer_head slab memory leak, Linux bug? Elisheva Alexander
@ 2001-09-02 11:09 ` Elisheva Alexander
  2001-09-02 12:46 ` buffer_head slab memory leak, Linux bug? Alan Cox
  1 sibling, 0 replies; 3+ messages in thread
From: Elisheva Alexander @ 2001-09-02 11:09 UTC (permalink / raw)
  To: linux-kernel

Please ignore the silly subject of my previous post, it's not related to
the message itself.



* Re: buffer_head slab memory leak, Linux bug?
  2001-09-02 11:01 buffer_head slab memory leak, Linux bug? Elisheva Alexander
  2001-09-02 11:09 ` SMP: stuck on TLB IPI wait (was Re: buffer_head slab memory leak, Linux bug?) Elisheva Alexander
@ 2001-09-02 12:46 ` Alan Cox
  1 sibling, 0 replies; 3+ messages in thread
From: Alan Cox @ 2001-09-02 12:46 UTC (permalink / raw)
  To: eli7; +Cc: linux-kernel

> on an SMP machine i get:
> "stuck on TLB IPI wait (CPU#1)"
> the driver that i am debugging uses a spin lock, and sometimes we take the
> lock for a pretty long time.

Basically you can't hold a spinlock too long or the kernel will conclude the
other processor has hung. Anything which is going to take a spinlock long
enough to trigger that event is so non-scalable it's not funny.
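
One common way to keep the hold time short is to take the lock only long enough to copy the shared data, then do the slow work on a private copy. The following is a minimal userspace sketch of that pattern, not code from the driver under discussion: the lock type, the table structure, and every `sketch_` name are hypothetical, and a C11 atomic_flag stands in for a kernel spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for a kernel spinlock. */
typedef struct { atomic_flag held; } sketch_lock_t;

static void sketch_lock(sketch_lock_t *l)
{
    /* Busy-wait until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;
}

static void sketch_unlock(sketch_lock_t *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

#define SKETCH_TABLE_SIZE 64

/* Hypothetical shared table protected by the lock. */
struct sketch_table {
    sketch_lock_t lock;
    unsigned int entries[SKETCH_TABLE_SIZE];
};

/* Stand-in for the slow work; it runs on a private copy, lock not held. */
static unsigned long sketch_checksum(const unsigned int *v, size_t n)
{
    unsigned long sum = 0;

    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum;
}

unsigned long sketch_process(struct sketch_table *t)
{
    unsigned int snapshot[SKETCH_TABLE_SIZE];

    assert(t != NULL);

    /* Critical section: nothing but a bounded copy. */
    sketch_lock(&t->lock);
    memcpy(snapshot, t->entries, sizeof(snapshot));
    sketch_unlock(&t->lock);

    /* The expensive part happens after the lock has been dropped. */
    return sketch_checksum(snapshot, SKETCH_TABLE_SIZE);
}
```

The trade-off is that the result reflects the table as it was at copy time; whether that is acceptable depends on the data being protected.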

It could also be that you have a locking error and are leaving the lock
held in some obscure case - and genuinely deadlocking the box.
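
A typical way such an "obscure case" arises is an early return on an error path that skips the unlock. The sketch below shows one common defence, a single exit label that every path goes through. It is a userspace illustration under stated assumptions, not the driver's code: the connection structure, the state values, and all `sketch_` names are hypothetical, and a C11 atomic_flag stands in for a kernel spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a kernel spinlock. */
typedef struct { atomic_flag held; } sketch_lock_t;

static void sketch_lock(sketch_lock_t *l)
{
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;
}

static void sketch_unlock(sketch_lock_t *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Debug helper: returns 1 if the lock is currently held, 0 otherwise. */
static int sketch_is_locked(sketch_lock_t *l)
{
    if (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        return 1;              /* it was already set: someone holds it */
    sketch_unlock(l);          /* we set it ourselves: undo and report free */
    return 0;
}

enum { SKETCH_CLOSED = 0, SKETCH_OPEN = 1 };

/* Hypothetical per-connection state protected by the lock. */
struct sketch_conn {
    sketch_lock_t lock;
    int state;
    int packets;
};

/* Returns 0 on success, -1 on error; the lock is released on every path. */
int sketch_handle_packet(struct sketch_conn *c,
                         const unsigned char *pkt, size_t len)
{
    int ret = -1;

    assert(c != NULL);
    sketch_lock(&c->lock);

    if (pkt == NULL || len == 0)
        goto out;              /* a bare "return -1" here would leak the lock */
    if (c->state != SKETCH_OPEN)
        goto out;              /* the rarely taken path is the one to audit */

    c->packets++;
    ret = 0;
out:
    sketch_unlock(&c->lock);
    return ret;
}
```

If either error branch returned directly instead of jumping to `out`, the next caller would spin on the lock forever, which from the outside looks the same as a lock held "too long".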


> this happens during heavy load, which is why i think that the problem
> is that in smp_flush_tlb() in ./arch/i386/kernel/smp.c, one of the CPUs gets 
> all upset that the other CPU is stuck in the lock for too long, and releases 
> it before it was meant to be released.

Probably

Alan
