It's similar to the race condition spotted in the i386 interrupt code. The race exists between tasklet_[hi_]action() and tasklet_disable(). Again, memory-ordered synchronization is needed between tasklet_struct.count and tasklet_struct.state.

tasklet_disable() is fine because there's an smp_mb() at the end of tasklet_disable_nosync(); however, in tasklet_action(), there is no mb() between tasklet_trylock(t) and atomic_read(&t->count). This won't cause any trouble on architectures which order memory accesses around atomic operations (including x86), but on architectures which don't, a tasklet can still be executing on another cpu on return from tasklet_disable().

Adding smp_mb__after_test_and_set_bit() at the end of tasklet_trylock() should remedy the situation. As smp_mb__{before|after}_test_and_set_bit() don't exist yet, I'm attaching a patch which adds smp_mb__after_clear_bit(). The patch is against 2.4.21.

P.S. Please comment on the addition of smp_mb__{before|after}_test_and_set_bit().

P.P.S. One thing I don't really understand is the use of smp_mb() at the end of tasklet_disable() and smp_mb__before_atomic_dec() inside tasklet_enable(). Can anybody tell me what those are for?

--
tejun