linux-kernel.vger.kernel.org archive mirror
* Re: Possible race condition in i386 global_irq_lock handling.
@ 2003-08-21 17:01 Manfred Spraul
  2003-08-21 17:27 ` Andrea Arcangeli
  0 siblings, 1 reply; 12+ messages in thread
From: Manfred Spraul @ 2003-08-21 17:01 UTC (permalink / raw)
  To: TeJun Huh; +Cc: linux-kernel, Zwane Mwaikambo

TeJun wrote:
> static inline void irq_enter(int cpu, int irq)
> {
> 	++local_irq_count(cpu);
> 
> 	while (test_bit(0,&global_irq_lock)) {
> 		cpu_relax();
> 	}
> }
> 
>  Is it a race condition or am I getting it horribly wrong?  Thx in
> advance.

Yes, it's a race. Actually, it is a variant of the race that led to the introduction of set_current_state():

test_bit() is a simple read instruction. i386 CPUs are free to execute it early, i.e. they can execute it before the write part of "++local_irq_count(cpu)".
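The early read can be illustrated with a toy, single-threaded model of a store buffer. This is only a sketch under stated assumptions: struct toy_cpu, the toy_* helpers and the two-slot memory[] array are hypothetical names invented for the illustration, and the model replays one fixed interleaving rather than real hardware behaviour. The barrier is modelled as a full barrier that drains the store buffer; which kernel primitive should provide it is not settled by this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical memory layout for the model: B's irq count and the lock bit. */
enum { LOC_COUNT_B, LOC_LOCK, NR_LOCS };

static int memory[NR_LOCS];

/* Toy CPU with a one-entry store buffer (an assumption of this sketch). */
struct toy_cpu {
	bool pending;	/* a store is buffered, not yet visible to other CPUs */
	int pend_loc;
	int pend_val;
};

/* Make the buffered store (if any) globally visible. */
static void toy_drain(struct toy_cpu *c)
{
	if (c->pending) {
		memory[c->pend_loc] = c->pend_val;
		c->pending = false;
	}
}

/* Plain store: goes into the buffer, older entry is written out first. */
static void toy_store(struct toy_cpu *c, int loc, int val)
{
	assert(loc >= 0 && loc < NR_LOCS);
	toy_drain(c);
	c->pending = true;
	c->pend_loc = loc;
	c->pend_val = val;
}

/* Plain load: sees the CPU's own buffered store, otherwise memory. */
static int toy_load(const struct toy_cpu *c, int loc)
{
	assert(loc >= 0 && loc < NR_LOCS);
	if (c->pending && c->pend_loc == loc)
		return c->pend_val;
	return memory[loc];
}

/* Locked read-modify-write: drains the buffer and acts on memory directly. */
static int toy_test_and_set(struct toy_cpu *c, int loc)
{
	int old;

	toy_drain(c);
	old = memory[loc];
	memory[loc] = 1;
	return old;
}

/*
 * Replay one interleaving. Returns true if both CPUs proceed, i.e. A
 * believes it owns the lock while B believes the lock is free.
 */
static bool toy_race(bool barrier)
{
	struct toy_cpu a = {0}, b = {0};
	bool b_sees_lock, a_got_bit, a_sees_irq;

	memory[LOC_COUNT_B] = 0;
	memory[LOC_LOCK] = 0;

	toy_store(&b, LOC_COUNT_B, 1);		/* B: ++local_irq_count */
	if (barrier)
		toy_drain(&b);			/* B: modelled full barrier */
	b_sees_lock = toy_load(&b, LOC_LOCK) != 0;	/* B: test_bit() */

	a_got_bit = toy_test_and_set(&a, LOC_LOCK) == 0;	/* A: take the bit */
	a_sees_irq = toy_load(&a, LOC_COUNT_B) != 0;	/* A: check B's count */

	toy_drain(&b);
	return a_got_bit && !a_sees_irq && !b_sees_lock;
}
```

Without the barrier, B's increment is still sitting in its buffer when A checks the count, so both sides go ahead. With the barrier, A sees the non-zero count in this interleaving and backs off.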

I think smp_rmb() is the right barrier - could you write a patch and send it to Marcelo?

--
	Manfred

* Possible race condition in i386 global_irq_lock handling.
@ 2003-08-21  8:48 TeJun Huh
  2003-08-21 10:07 ` Zwane Mwaikambo
  0 siblings, 1 reply; 12+ messages in thread
From: TeJun Huh @ 2003-08-21  8:48 UTC (permalink / raw)
  To: linux-kernel

 I've been reading i386 interrupt handling code for a couple of days
and encountered something that looks like a race condition.  It's
between include/asm-i386/hardirq.h:irq_enter() and
arch/i386/kernel/irq.c:get_irqlock().  They seem to be using lockless
synchronization with local_irq_count of each cpu and global_irq_lock
variable.

 A. locking CPU

 1. Do test_and_set_bit() on global_irq_lock, if fail, repeat.
 2. If all local_irq_count's are zero, we're the winner.  Check other
    stuff; otherwise, clear global_irq_lock and retry.

 B. other CPUs

 1. Increment local_irq_count
 2. test_bit() on global_irq_lock, if zero, continue handling interrupt;
    otherwise, wait till it's cleared.

 For this to work, the locking CPU should fetch the value of
local_irq_count after global_irq_lock value becomes visible to other
CPUs, and other CPUs should fetch the value of global_irq_lock after
making the incremented local_irq_count visible to other CPUs.
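 The requirement above can be modelled in user space with C11 atomics. This is a sketch, not the kernel code: try_get_irqlock(), release_irqlock(), irq_enter_model(), irq_exit_model() and the NR_CPUS value are hypothetical stand-ins, and the fence in irq_enter_model() is an assumption that marks where the incremented count has to become visible before the lock bit is read.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 2	/* hypothetical CPU count for the model */

static atomic_int global_irq_lock;		/* bit 0 models the lock bit */
static atomic_int local_irq_count[NR_CPUS];

/* A: one attempt by the locking CPU. Returns true if it won the lock. */
static bool try_get_irqlock(void)
{
	int cpu;

	/* A.1: atomic test-and-set; an already set bit means we lost. */
	if (atomic_fetch_or(&global_irq_lock, 1) & 1)
		return false;

	/* A.2: back off if any CPU is inside an interrupt handler. */
	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (atomic_load(&local_irq_count[cpu]) != 0) {
			atomic_store(&global_irq_lock, 0);
			return false;
		}
	}
	return true;
}

static void release_irqlock(void)
{
	atomic_store(&global_irq_lock, 0);
}

/* B: interrupt entry on another CPU. */
static void irq_enter_model(int cpu)
{
	int old;

	assert(cpu >= 0 && cpu < NR_CPUS);

	/* B.1: plain (non-locked) increment; only this CPU writes its count. */
	old = atomic_load_explicit(&local_irq_count[cpu], memory_order_relaxed);
	atomic_store_explicit(&local_irq_count[cpu], old + 1,
			      memory_order_relaxed);

	/* Assumed full barrier: publish the count before reading the lock. */
	atomic_thread_fence(memory_order_seq_cst);

	/* B.2: wait until the lock bit is clear. */
	while (atomic_load_explicit(&global_irq_lock, memory_order_relaxed) & 1)
		;
}

static void irq_exit_model(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	atomic_fetch_sub(&local_irq_count[cpu], 1);
}
```

 In this model, either the locking CPU observes the non-zero count and backs off, or the interrupted CPU observes the lock bit and waits; without the fence, the relaxed store and load in irq_enter_model() carry no such guarantee.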

 The locking CPU is OK because test_and_set_bit() forces ordering on
x86, but the other CPUs need an mb() between steps 1 and 2, because
neither ++ nor test_bit() imposes any ordering.  Part B is irq_enter()
in hardirq.h, which looks like the following.

static inline void irq_enter(int cpu, int irq)
{
	++local_irq_count(cpu);

	while (test_bit(0,&global_irq_lock)) {
		cpu_relax();
	}
}

 Is it a race condition or am I getting it horribly wrong?  Thx in
advance.
