* Trivial hard lockup, SCSI, 2.4.23
@ 2003-12-16 18:28 Ian Soboroff
2003-12-18 13:53 ` Marcelo Tosatti
0 siblings, 1 reply; 3+ messages in thread
From: Ian Soboroff @ 2003-12-16 18:28 UTC
To: linux-kernel
I've found that I can lock a machine running 2.4.23aa1 by trying to
access a nonexistent SCSI device. In other words, if a userspace
program tries to access /dev/sdd, but no device is attached on any
SCSI bus using that device node, the machine locks hard.
We found this when we disconnected a SCSI hardware RAID from a server,
but forgot to remove the cron job which checked its status.
The lockup leaves no errors whatsoever in the logs. I finally tracked
it down with the NMI watchdog.
Ian
* Re: Trivial hard lockup, SCSI, 2.4.23
From: Marcelo Tosatti @ 2003-12-18 13:53 UTC
To: Ian Soboroff; +Cc: linux-kernel
On Tue, 16 Dec 2003, Ian Soboroff wrote:
>
> I've found that I can lock a machine running 2.4.23aa1 by trying to
> access a nonexistent SCSI device. In other words, if a userspace
> program tries to access /dev/sdd, but no device is attached on any
> SCSI bus using that device node, the machine locks hard.
>
> We found this when we disconnected a SCSI hardware RAID from a server,
> but forgot to remove the cron job which checked its status.
>
> The lockup leaves no errors whatsoever in the logs. I finally tracked
> it down with the NMI watchdog.
What did the NMI oopser report?
* Re: Trivial hard lockup, SCSI, 2.4.23
From: Ian Soboroff @ 2003-12-18 14:21 UTC
To: Marcelo Tosatti; +Cc: linux-kernel
Marcelo Tosatti <marcelo.tosatti@cyclades.com> writes:
> On Tue, 16 Dec 2003, Ian Soboroff wrote:
>
>>
>> I've found that I can lock a machine running 2.4.23aa1 by trying to
>> access a nonexistent SCSI device. In other words, if a userspace
>> program tries to access /dev/sdd, but no device is attached on any
>> SCSI bus using that device node, the machine locks hard.
>>
>> We found this when we disconnected a SCSI hardware RAID from a server,
>> but forgot to remove the cron job which checked its status.
>>
>> The lockup leaves no errors whatsoever in the logs. I finally tracked
>> it down with the NMI watchdog.
>
> What did the NMI oopser report?
It didn't get logged, so I don't have the full trace, but I remember
that it indicated a program we run called raidm, which checks the
status of our RAIDs periodically. We'd forgotten to kill the instance
which was watching the RAID we disconnected.
Running raidm on a connected RAID, I can see that it opens the device
and sends some ioctls:
...
stat64("/dev/sda1", {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 1), ...}) = 0
open("/dev/sda1", O_RDONLY) = 3
ioctl(3, FIBMAP, 0xbfff4840) = 0
ioctl(3, FIBMAP, 0xbfff4840) = 0
ioctl(3, FIBMAP, 0xbfff4910) = 134217730
ioctl(3, FIBMAP, 0xbfff4910) = 134217730
ioctl(3, FIBMAP, 0xbfff4910) = 134217730
ioctl(3, FIBMAP, 0xbfff4910) = 134217730
ioctl(3, FIBMAP, 0xbfff4910) = 134217730
ioctl(3, FIBMAP, 0xbfff4910) = 134217730
time(NULL) = 1071757056
open("/var/log/raidm.log", O_WRONLY|O_APPEND|O_CREAT, 0666) = 4
fstat64(4, {st_mode=S_IFREG|0644, st_size=5049303, ...}) = 0
...
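[Editorial note on the trace above] The repeated return value 134217730 is 0x08000002 in hex. One possible reading, offered as an assumption that nothing in this thread confirms, is that strace has mislabelled the request: FIBMAP and SCSI_IOCTL_SEND_COMMAND share request number 1, and the latter returns a packed SCSI result word. Under that assumption, the value can be split into its four bytes as sketched below. The function name decode_scsi_result is hypothetical and used only for illustration.

```python
def decode_scsi_result(result):
    """Split a 32-bit word into four bytes, low byte first.

    Assumption: the word is a packed SCSI result (status, message,
    host and driver bytes). The thread does not confirm this.
    """
    return {
        "status": result & 0xFF,          # raw SCSI status byte
        "msg": (result >> 8) & 0xFF,      # message byte
        "host": (result >> 16) & 0xFF,    # host adapter byte
        "driver": (result >> 24) & 0xFF,  # driver byte
    }


# Value taken from the strace output quoted above.
print(decode_scsi_result(134217730))
# → {'status': 2, 'msg': 0, 'host': 0, 'driver': 8}
```

If the assumption holds, a status byte of 2 would correspond to CHECK CONDITION and a driver byte of 8 to sense data being available. That would be a normal response from a connected device, and it says nothing by itself about the lockup.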
I can't afford to kill the machine again, otherwise I'd trigger the
oops again. Shouldn't the SCSI layer or aic7xxx refuse to let me
open(2) the device if nothing's attached?
Ian
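[Editorial note] The behaviour asked about in the closing question can be checked from userspace with a small probe. The sketch below is not raidm and is not taken from the thread. It only attempts the open(2) step and reports the errno name on failure. The helper name probe_device and the use of O_NONBLOCK are assumptions made for illustration. On a kernel without the reported bug, opening a device node with nothing attached would be expected to fail with an error such as ENXIO rather than hang. On an affected kernel, running it against such a node may reproduce the lockup, so it should only be tried on a machine that can be rebooted.

```python
import errno
import os


def probe_device(path):
    """Try to open a device node read-only and report the outcome.

    Returns (True, None) if the open succeeds, otherwise
    (False, errno_name). O_NONBLOCK is an assumption: it asks the
    driver not to wait on slow or absent media during open.
    """
    try:
        fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    except OSError as exc:
        return False, errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)
    return True, None


if __name__ == "__main__":
    # /dev/sdd is the node named in the original report.
    print(probe_device("/dev/sdd"))
```

A monitoring job could call such a probe before issuing any ioctls and skip devices that fail to open.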