Re: lockup on Athlon systems, kernel race condition?

From: Manfred Spraul @ 2002-09-03 21:46 UTC
  To: Andrew Morton; +Cc: Terence Ripperda, linux-kernel

> Terence Ripperda wrote:
>> 
>> ...
>>
>> asmlinkage long sys_ioctl(unsigned int fd, unsigned int cmd, unsigned long arg)
>> {
>>         struct file * filp;
>>         unsigned int flag;
>>         int on, error = -EBADF;
>> 
>>         filp = fget(fd);
>>         if (!filp)
>>                 goto out;
>>         error = 0;
>>         lock_kernel();    <====
Which compiler do you use, and which kernel? Which additional patches?

With my 2.4.20-pre4-ac1 kernel, the lock_kernel is at offset +3a; 
according to your dump it's at +6a.

>>         switch (cmd) {
> 
> This CPU is spinning, waiting for kernel_flag.  It will take the IPI
> and the other CPU's smp_call_function() will succeed.
> 
> Possibly the IPI has got lost - seems that this is a popular failure mode
> for flakey chipsets/motherboards.
> 
> Or someone has called sys_ioctl() with interrupts disabled.  That's very
> doubtful.

Is it possible to display the cpu registers with kdb? Could you check 
that the interrupts are enabled?

I'd add a quick test to sys_ioctl() or lock_kernel(): save_flags, and 
check that bit 9 (the interrupt flag) is always set. See __global_cli 
for sample code.
The X server probably runs with enough privileges to disable 
interrupts; perhaps it's doing something stupid.
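
A rough sketch of the bit 9 test, written as plain C so it can be tried 
outside the kernel. The names EFLAGS_IF_MASK, irqs_enabled_in_flags() and 
check_irqs_enabled() are made up for illustration. In the kernel the flags 
value would come from save_flags() and the message would go through 
printk() instead of fprintf().

```c
#include <stdio.h>

/* Bit 9 of the x86 EFLAGS register is IF, the interrupt enable flag. */
#define EFLAGS_IF_MASK (1UL << 9)

/* Return 1 if the saved flags word has interrupts enabled, 0 otherwise. */
static int irqs_enabled_in_flags(unsigned long flags)
{
        return (flags & EFLAGS_IF_MASK) != 0;
}

/*
 * Hypothetical debug check: complain if the caller arrived with
 * interrupts disabled. 'where' names the call site, 'flags' is the
 * value a save_flags() at that call site would have stored.
 * Returns 1 if interrupts were enabled, 0 if not.
 */
static int check_irqs_enabled(const char *where, unsigned long flags)
{
        if (irqs_enabled_in_flags(flags))
                return 1;

        fprintf(stderr, "%s: entered with interrupts disabled (flags=%#lx)\n",
                where, flags);
        return 0;
}
```

Calling something like check_irqs_enabled("sys_ioctl", flags) right before 
lock_kernel() should show whether anyone enters with interrupts off.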

--
	Manfred
