linux-kernel.vger.kernel.org archive mirror
* Trivial hard lockup, SCSI, 2.4.23
@ 2003-12-16 18:28 Ian Soboroff
  2003-12-18 13:53 ` Marcelo Tosatti
  0 siblings, 1 reply; 3+ messages in thread
From: Ian Soboroff @ 2003-12-16 18:28 UTC (permalink / raw)
  To: linux-kernel


I've found that I can lock a machine running 2.4.23aa1 by trying to
access a nonexistent SCSI device.  In other words, if a userspace
program tries to access /dev/sdd, but no device is attached on any
SCSI bus using that device node, the machine locks hard.

We found this when we disconnected a SCSI hardware RAID from a server,
but forgot to remove the cron job which checked its status.

The lockup leaves no errors whatsoever in the logs.  I finally tracked
it down with the NMI watchdog.

Ian
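
One way a status-checking cron job could avoid touching a device node with nothing behind it is to consult the kernel's list of attached SCSI devices first. The sketch below is only an illustration under stated assumptions: it assumes the list is readable from /proc/scsi/scsi in the usual "Host: ... Channel: ... Id: ... Lun: ..." layout, the SAMPLE text is made up, and the helper names are hypothetical. It has not been tested against the lockup described above.

```python
# Illustrative /proc/scsi/scsi-style text; vendor and model values are invented.
SAMPLE = """Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: EXAMPLE  Model: DISK-A           Rev: 0001
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 03 Lun: 00
  Vendor: EXAMPLE  Model: RAID-B           Rev: 0001
  Type:   Direct-Access                    ANSI SCSI revision: 02
"""


def attached_scsi_devices(proc_text):
    """Return a list of (host, channel, id, lun) tuples found in the text."""
    devices = []
    for line in proc_text.splitlines():
        fields = line.split()
        # Entry header lines look like: Host: scsi0 Channel: 00 Id: 00 Lun: 00
        if (len(fields) == 8 and fields[0] == "Host:"
                and fields[2] == "Channel:" and fields[4] == "Id:"
                and fields[6] == "Lun:"):
            devices.append(
                (fields[1], int(fields[3]), int(fields[5]), int(fields[7]))
            )
    return devices


def raid_is_attached(proc_text, address):
    """True if the (host, channel, id, lun) address appears in the listing."""
    return address in attached_scsi_devices(proc_text)
```

A cron wrapper could read /proc/scsi/scsi, call raid_is_attached() with the address the RAID is expected at, and skip the status check when it returns False. Reading the listing does not open the disk's device node.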




* Re: Trivial hard lockup, SCSI, 2.4.23
  2003-12-16 18:28 Trivial hard lockup, SCSI, 2.4.23 Ian Soboroff
@ 2003-12-18 13:53 ` Marcelo Tosatti
  2003-12-18 14:21   ` Ian Soboroff
  0 siblings, 1 reply; 3+ messages in thread
From: Marcelo Tosatti @ 2003-12-18 13:53 UTC (permalink / raw)
  To: Ian Soboroff; +Cc: linux-kernel



On Tue, 16 Dec 2003, Ian Soboroff wrote:

> 
> I've found that I can lock a machine running 2.4.23aa1 by trying to
> access a nonexistent SCSI device.  In other words, if a userspace
> program tries to access /dev/sdd, but no device is attached on any
> SCSI bus using that device node, the machine locks hard.
> 
> We found this when we disconnected a SCSI hardware RAID from a server,
> but forgot to remove the cron job which checked its status.
> 
> The lockup leaves no errors whatsoever in the logs.  I finally tracked
> it down with the NMI watchdog.

What did the NMI oopser report?



* Re: Trivial hard lockup, SCSI, 2.4.23
  2003-12-18 13:53 ` Marcelo Tosatti
@ 2003-12-18 14:21   ` Ian Soboroff
  0 siblings, 0 replies; 3+ messages in thread
From: Ian Soboroff @ 2003-12-18 14:21 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: linux-kernel

Marcelo Tosatti <marcelo.tosatti@cyclades.com> writes:

> On Tue, 16 Dec 2003, Ian Soboroff wrote:
>
>> 
>> I've found that I can lock a machine running 2.4.23aa1 by trying to
>> access a nonexistent SCSI device.  In other words, if a userspace
>> program tries to access /dev/sdd, but no device is attached on any
>> SCSI bus using that device node, the machine locks hard.
>> 
>> We found this when we disconnected a SCSI hardware RAID from a server,
>> but forgot to remove the cron job which checked its status.
>> 
>> The lockup leaves no errors whatsoever in the logs.  I finally tracked
>> it down with the NMI watchdog.
>
> What did the NMI oopser report ? 

It didn't get logged, so I don't have the full trace, but I remember
that it indicated a program we run called raidm, which checks the
status of our RAIDs periodically.  We'd forgotten to kill the instance
which was watching the RAID we disconnected.

Running raidm on a connected RAID, I can see that it opens the device
and sends some ioctls:

...
stat64("/dev/sda1", {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 1), ...}) = 0
open("/dev/sda1", O_RDONLY)             = 3
ioctl(3, FIBMAP, 0xbfff4840)            = 0
ioctl(3, FIBMAP, 0xbfff4840)            = 0
ioctl(3, FIBMAP, 0xbfff4910)            = 134217730
ioctl(3, FIBMAP, 0xbfff4910)            = 134217730
ioctl(3, FIBMAP, 0xbfff4910)            = 134217730
ioctl(3, FIBMAP, 0xbfff4910)            = 134217730
ioctl(3, FIBMAP, 0xbfff4910)            = 134217730
ioctl(3, FIBMAP, 0xbfff4910)            = 134217730
time(NULL)                              = 1071757056
open("/var/log/raidm.log", O_WRONLY|O_APPEND|O_CREAT, 0666) = 4
fstat64(4, {st_mode=S_IFREG|0644, st_size=5049303, ...}) = 0
...

I can't afford to kill the machine again, otherwise I'd trigger the
oops again.  Shouldn't the SCSI layer or aic7xxx refuse to let me
open(2) the device if nothing's attached?

Ian
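
The behaviour the question above expects can be pictured from userspace as follows. This is a minimal sketch, not raidm's actual code: probe_block_device is a hypothetical helper that performs only the stat and open steps seen in the strace output and reports the errno name if either fails. It assumes a kernel that returns an error for an absent device (something like ENXIO or ENODEV would be the natural result, though that is an assumption), so it illustrates the expected outcome rather than working around the lockup, which happens inside the kernel.

```python
import errno
import os
import stat


def probe_block_device(path):
    """Return (ok, reason) after stat + open only; no ioctl is issued."""
    try:
        st = os.stat(path)
    except OSError as exc:
        return False, errno.errorcode.get(exc.errno, str(exc.errno))

    # Refuse anything that is not a block device node.
    if not stat.S_ISBLK(st.st_mode):
        return False, "not a block device"

    try:
        # O_NONBLOCK is a common precaution when probing device nodes;
        # whether it changes anything depends on the driver.
        fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    except OSError as exc:
        return False, errno.errorcode.get(exc.errno, str(exc.errno))

    os.close(fd)
    return True, "opened"
```

A checker could call probe_block_device("/dev/sdd") and send its ioctls only when the first element of the result is True, logging the reason otherwise.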


