linux-kernel.vger.kernel.org archive mirror
* very strange (semi-)lockups in 2.4.5
@ 2001-06-18  7:05 Pozsar Balazs
  2001-06-18 17:10 ` Albert D. Cahalan
  2001-06-19 13:09 ` Pavel Machek
  0 siblings, 2 replies; 7+ messages in thread
From: Pozsar Balazs @ 2001-06-18  7:05 UTC (permalink / raw)
  To: linux-kernel


Hi all.

I'm having ~2 lockups a day. The following happens:
 If I was under X, I can only use the magic-key; there is no other keyboard
(eg numlock) or mouse response, the screen freezes, processes stop.
 If I was using textmode:
  numlock still works
  cursor blinks
  processes stop (eg, gpm doesn't work, outputs freeze)
  I can still switch vt's.
  BUT, I can only type into a few vt's, last time into 3,5,6,7,8, but not
into 1,2 or 4!

I cannot give you any traces, as I don't have any.

Also note that magic-key works, and it says that it umounts filesystems if
I press magic-u, but next time at mount I see that reiserfs is replaying
transactions.


Any ideas?

The machine is a P3-750, 512M ram, abit vp6 mb. No overclocking, and it
passes memtest86.


Balazs Pozsar.
-- 




* Re: very strange (semi-)lockups in 2.4.5
  2001-06-18  7:05 very strange (semi-)lockups in 2.4.5 Pozsar Balazs
@ 2001-06-18 17:10 ` Albert D. Cahalan
  2001-06-18 19:20   ` Pozsar Balazs
  2001-06-19 13:09 ` Pavel Machek
  1 sibling, 1 reply; 7+ messages in thread
From: Albert D. Cahalan @ 2001-06-18 17:10 UTC (permalink / raw)
  To: Pozsar Balazs; +Cc: linux-kernel

Pozsar Balazs writes:

> I'm having ~2 lockups a day. The following happens:
>  If I was under X, I can only use the magic-key; there is no other keyboard
> (eg numlock) or mouse response, the screen freezes, processes stop.
>  If I was using textmode:
>   numlock still works
>   cursor blinks
>   processes stop (eg, gpm doesn't work, outputs freeze)
>   I can still switch vt's.
>   BUT, I can only type into a few vt's, last time into 3,5,6,7,8, but not
> into 1,2 or 4!
> 
> I cannot give you any traces, as I don't have any.
> 
> Also note that magic-key works, and it says that it umounts filesystems if
> I press magic-u, but next time at mount I see that reiserfs is replaying
> transactions.
> 
> 
> Any ideas?
> 
> The machine is a P3-750, 512M ram, abit vp6 mb. No overclocking, and it
> passes memtest86.

I think I'm getting the same thing, but I don't have the magic-key
compiled in. I'm going to hook up a VT510 to the serial port, in case
this is just XFree86 crashing. For anyone collecting statistics:

kernels 2.4.4-pre6 (?) and now 2.4.6-pre3
plain Pentium MMX @ 200 MHz
Intel motherboard -- see below
stable since 1996, on a UPS, dust-free, and the fan works
one lockup per day with desktop usage

In case the serial console doesn't work, could someone post plans
for a safe NMI board? (both ISA and PCI) The best I found:
http://www.sandelman.ottawa.on.ca/linux-ipsec/html/2000/02/msg00425.html
http://www.sandelman.ottawa.on.ca/linux-ipsec/html/2000/02/msg00391.html
(for PCI you're supposed to assert SERR# on the clock -- how?)

00:00.0 Host bridge: Intel Corporation 430TX - 82439TX MTXC (rev 01)
00:07.0 ISA bridge: Intel Corporation 82371AB PIIX4 ISA (rev 01)
00:07.1 IDE interface: Intel Corporation 82371AB PIIX4 IDE (rev 01)
00:07.2 USB Controller: Intel Corporation 82371AB PIIX4 USB (rev 01)
00:07.3 Bridge: Intel Corporation 82371AB PIIX4 ACPI (rev 01)
00:11.0 Ethernet controller: Digital Equipment Corporation DECchip 21040 [Tulip] (rev 23)
00:13.0 Ethernet controller: Lite-On Communications Inc LNE100TX Fast Ethernet Adapter (rev 25)
00:14.0 VGA compatible controller: ATI Technologies Inc 3D Rage Pro 215GP (rev 5c)


* Re: very strange (semi-)lockups in 2.4.5
  2001-06-18 17:10 ` Albert D. Cahalan
@ 2001-06-18 19:20   ` Pozsar Balazs
  2001-06-18 20:45     ` george anzinger
  0 siblings, 1 reply; 7+ messages in thread
From: Pozsar Balazs @ 2001-06-18 19:20 UTC (permalink / raw)
  To: Albert D. Cahalan; +Cc: Pozsar Balazs, linux-kernel


First I thought this was an X-issue, but now I'm 100% sure that it isn't,
as I hit the described hangup while working on the console too.

The NMI card would be interesting. If anyone tells me how to make one, and
how to patch the kernel to show usable information, I'm looking forward to
doing it and sending reports.


regards
Balazs Pozsar.



-- 




* Re: very strange (semi-)lockups in 2.4.5
  2001-06-18 19:20   ` Pozsar Balazs
@ 2001-06-18 20:45     ` george anzinger
  2001-06-18 21:18       ` Albert D. Cahalan
  0 siblings, 1 reply; 7+ messages in thread
From: george anzinger @ 2001-06-18 20:45 UTC (permalink / raw)
  To: Pozsar Balazs; +Cc: Albert D. Cahalan, linux-kernel

Pozsar Balazs wrote:
> 
> First I thought this was an X-issue, but now I'm 100% sure that it isn't,
> as I hit the described hangup while working on the console too.
> 
> The NMI card would be interesting. If anyone tells me how to make one, and
> how to patch the kernel to show usable information, I'm looking forward to
> doing it and sending reports.
> 
Given that your system still handles interrupts:
a.) It would probably not trigger an NMI timer (the interrupts would
keep resetting it)
b.) Using KGDB will, most likely, be all you need anyway.

KGDB (find it at SourceForge) requires a second computer which is hooked
to the "subject" over an RS-232 serial line.  The "host" computer needs to
have the vmlinux file as well as the sources for the kernel.  Once the
"subject" becomes ill, a control-C (from the "host") on the serial line
will interrupt it and start a dialog with gdb running on the "host"
computer.  If you have the second computer, this is the easiest way to
get to where you need to be.
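
[Editorial sketch, not part of the original message.] The "dialog with gdb"
described above normally runs over gdb's remote serial protocol: each
request is framed as $<payload>#<two hex checksum digits>, where the
checksum is the sum of the payload bytes modulo 256, and the control-C is
the single byte 0x03. The sketch below only builds those byte strings; it
assumes the kgdb stub in question speaks the standard protocol, and the
helper names are made up for illustration.

```python
# Hypothetical helpers illustrating gdb remote serial protocol framing.
# Assumption: the kgdb stub on the "subject" uses the standard protocol.

BREAK = b"\x03"  # the control-C byte the "host" sends to interrupt the "subject"


def rsp_checksum(payload: bytes) -> int:
    """Sum of all payload bytes, modulo 256."""
    return sum(payload) % 256


def rsp_packet(payload: str) -> bytes:
    """Frame a payload as $<payload>#<checksum as two lowercase hex digits>."""
    data = payload.encode("ascii")
    return b"$" + data + b"#" + f"{rsp_checksum(data):02x}".encode("ascii")


if __name__ == "__main__":
    # "?" asks the stub why the target stopped; "g" asks for the registers.
    print(BREAK + rsp_packet("?"))
    print(rsp_packet("g"))
```

In practice gdb on the "host" does all of this itself; the sketch is only
meant to show what actually travels over the serial line.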

If you have a complete freeze, then the NMI is useful, but even here, it
is best to let KGDB handle the NMI.  Much easier to see what's what than
looking thru an OOPS.

George




* Re: very strange (semi-)lockups in 2.4.5
  2001-06-18 20:45     ` george anzinger
@ 2001-06-18 21:18       ` Albert D. Cahalan
  2001-06-19  0:38         ` george anzinger
  0 siblings, 1 reply; 7+ messages in thread
From: Albert D. Cahalan @ 2001-06-18 21:18 UTC (permalink / raw)
  To: george anzinger; +Cc: Pozsar Balazs, Albert D. Cahalan, linux-kernel

george anzinger writes:
> Pozsar Balazs wrote:

>> The NMI card would be interesting. If anyone tells me how to make
>> one, and how to patch the kernel to show usable information, I'm
>> looking forward to doing it and sending reports.
>
> Given that your system still handles interrupts:
> a.) It would probably not trigger an NMI timer (the interrupts would
> keep resetting it)

Huh? No, this isn't the NMI timer. It's an NMI you generate
with a pushbutton on the back of your PC. My computer doesn't
have the APIC hardware needed for an NMI timer anyway.

For a PCI card, one must assert the SERR# signal. This is supposed
to be done for 1 clock cycle, on the proper clock edge. Going a bit
beyond 1 clock cycle ought to be OK, but my hand on a button is
likely to assert SERR# for millions of clock cycles. I've no idea
if my motherboard will handle that well.
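
[Editorial sketch, not part of the original message.] The "millions of
clock cycles" estimate above is easy to check with a little arithmetic.
The sketch assumes a conventional 33 MHz PCI bus and a button press of
about a tenth of a second; both numbers are assumptions, not figures from
the thread.

```python
# Rough arithmetic for how long a hand-pressed button holds SERR# asserted.
# Assumption: conventional 33 MHz PCI clock (one cycle is about 30 ns).

PCI_CLOCK_HZ = 33_333_333


def cycles_asserted(seconds: float, clock_hz: int = PCI_CLOCK_HZ) -> int:
    """Number of bus clock cycles that elapse in the given time."""
    return round(seconds * clock_hz)


if __name__ == "__main__":
    # A quick 0.1 s press spans a few million PCI clock cycles.
    print(cycles_asserted(0.1))
```

So even a very quick press is several orders of magnitude longer than the
single cycle mentioned above.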

> b.) Using KGDB will, most likely, be all you need anyway.

I'd rather just get an oops, but even still the board would
be good to have. KGDB can be triggered by an NMI, right?

Building an NMI board would be fun, overkill or not. :-)

> If you have a complete freeze, then the NMI is useful, but even
> here, it is best to let KGDB handle the NMI.  Much easier to see
> what's what than looking thru an OOPS.

I don't have an APIC. I have a plain Pentium MMX. If I want
an NMI it's going to come from a pushbutton.


* Re: very strange (semi-)lockups in 2.4.5
  2001-06-18 21:18       ` Albert D. Cahalan
@ 2001-06-19  0:38         ` george anzinger
  0 siblings, 0 replies; 7+ messages in thread
From: george anzinger @ 2001-06-19  0:38 UTC (permalink / raw)
  To: Albert D. Cahalan; +Cc: Pozsar Balazs, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2582 bytes --]

"Albert D. Cahalan" wrote:
> 
> george anzinger writes:
> > Pozsar Balazs wrote:
> 
> >> The NMI card would be interesting. If anyone tells me how to make
> >> one, and how to patch the kernel to show usable information, I'm
> >> looking forward to doing it and sending reports.
> >
> > Given that your system still handles interrupts:
> > a.) It would probably not trigger an NMI timer (the interrupts would
> > keep resetting it)
> 
> Huh? No, this isn't the NMI timer. It's an NMI you generate
> with a pushbutton on the back of your PC. My computer doesn't
> have the APIC hardware needed for an NMI timer anyway.

Actually all that is needed is an 8254 PIT.  This is the same chip that
is used to generate the system clock (timer 0) and tones (timer 2).  As
I understand it timer 1 is used for memory refresh.  Each x86 PC has at
least one of these chips and each chip has three timers all clocked from
the same clock source.  But, and this is the point/question, some PCs
have a second chip wired to NMI.  At least that is what my "The
Indispensable PC hardware book" says.

If your machine has such a chip, it would be rather easy to program it to do
what the APIC NMI does.  According to the above book, the second PIT
should be at port 048h (counter 0) with the control port at 04bh.
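
[Editorial sketch, not part of the original message.] To make "rather easy
to program" concrete, here is how the 8254 control word and reload value
could be computed for the ports quoted above (048h for counter 0, 04bh for
control). Several things are assumptions: that the second PIT is clocked at
the usual 1.193182 MHz, that rate-generator mode 2 is the right mode for a
periodic NMI, and that the counters sit at consecutive ports. The function
names are hypothetical, and the code only returns the (port, byte) writes;
it does not touch hardware.

```python
# Hypothetical planning helper for an 8254 PIT; performs no port I/O.
# Assumptions: 1.193182 MHz input clock, mode 2 (rate generator),
# counter n at base + n, ports 0x48 / 0x4B as quoted in the message.

PIT_INPUT_HZ = 1_193_182


def pit_control_word(counter: int, access: int = 3, mode: int = 2,
                     bcd: bool = False) -> int:
    """Build the 8254 control byte: counter select in bits 7-6, access mode
    in bits 5-4 (3 = low byte then high byte), operating mode in bits 3-1,
    BCD flag in bit 0."""
    if not 0 <= counter <= 2:
        raise ValueError("counter must be 0, 1 or 2")
    if not 0 <= access <= 3:
        raise ValueError("access must be 0..3")
    if not 0 <= mode <= 5:
        raise ValueError("mode must be 0..5")
    return (counter << 6) | (access << 4) | (mode << 1) | int(bcd)


def pit_divisor(hz: float) -> int:
    """Reload value that gives roughly `hz` ticks per second."""
    divisor = round(PIT_INPUT_HZ / hz)
    if not 1 <= divisor <= 65536:
        raise ValueError("frequency out of range for a 16-bit counter")
    return divisor


def pit_program(counter: int, hz: float, base: int = 0x48,
                control: int = 0x4B) -> list:
    """Return the (port, byte) writes needed, in order."""
    divisor = pit_divisor(hz) & 0xFFFF  # a count of 65536 is written as 0
    return [
        (control, pit_control_word(counter)),
        (base + counter, divisor & 0xFF),   # low byte first
        (base + counter, divisor >> 8),     # then high byte
    ]


if __name__ == "__main__":
    for port, value in pit_program(0, 100):
        print(f"outb(0x{value:02x}, 0x{port:02x})")
```

Whether ticks from that counter actually reach the NMI line depends on how
the board is wired, which is exactly the open question in the paragraph
above.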

But, actually my point is that, if your system is handling interrupts,
it is not so dead as to need an NMI.  The control-C from the serial
card should, for example, get it to do something useful (you know, like
spill its guts).
> 
> For a PCI card, one must assert the SERR# signal. This is supposed
> to be done for 1 clock cycle, on the proper clock edge. Going a bit
> beyond 1 clock cycle ought to be OK, but my hand on a button is
> likely to assert SERR# for millions of clock cycles. I've no idea
> if my motherboard will handle that well.

If you really want to build the push button, take a look at the
attached.  This is a message file that comes with KGDB....
> 
> > b.) Using KGDB will, most likely, be all you need anyway.
> 
> I'd rather just get an oops, but even still the board would
> be good to have. KGDB can be triggered by an NMI, right?

Usually the KGDB patch is set to catch NMI as well as any other kernel
fault.
> 
> Building an NMI board would be fun, overkill or not. :-)
> 
> > If you have a complete freeze, then the NMI is useful, but even
> > here, it is best to let KGDB handle the NMI.  Much easier to see
> > what's what than looking thru an OOPS.
> 
> I don't have an APIC. I have a plain Pentium MMX. If I want
> an NMI it's going to come from a pushbutton.

[-- Attachment #2: debug-nmi.txt --]
[-- Type: text/plain, Size: 1499 bytes --]

Subject: Debugging with NMI
Date: Mon, 12 Jul 1999 11:28:31 -0500
From: David Grothe <dave@gcom.com>
Organization: Gcom, Inc
To: David Grothe <dave@gcom.com>

Kernel hackers:

Maybe this is old hat, but it is new to me --

On an ISA bus machine, if you short out the A1 and B1 pins of an ISA
slot you will generate an NMI to the CPU.  This interrupts even a
machine that is hung in a loop with interrupts disabled.  Used in
conjunction with kgdb <
ftp://ftp.gcom.com/pub/linux/src/kgdb-2.3.35/kgdb-2.3.35.tgz > you can
gain debugger control of a machine that is hung in the kernel!  Even
without kgdb the kernel will print a stack trace so you can find out
where it was hung.

The A1/B1 pins are directly opposite one another and the farthest pins
towards the bracket end of the ISA bus socket.  You can stick a paper
clip or multi-meter probe between them to short them out.

I had a spare ISA bus to PC104 bus adapter around.  The PC104 end of the
board consists of two rows of wire wrap pins.  So I wired a push button
between the A1/B1 pins and now have an ISA board that I can stick into
any ISA bus slot for debugger entry.

Microsoft has a circuit diagram of a PCI card at
http://www.microsoft.com/hwdev/DEBUGGING/DMPSW.HTM.  If you want to
build one you will have to mail them and ask for the PAL equations.
Nobody makes one commercially.

[THIS TIP COMES WITH NO WARRANTY WHATSOEVER.  It works for me, but if
your machine catches fire, it is your problem, not mine.]

-- Dave (the kgdb guy)
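
[Editorial sketch, not part of the original attachment.] The reason the
A1/B1 short works is that ISA pin A1 is the I/O channel check line and B1
is ground. On a PC the NMI handler can read the status port at 61h to find
out why it was called: bit 6 reports an I/O channel check and bit 7 a
memory parity error or, on PCI chipsets, SERR#. The decoder below is a
minimal illustration of that check with hypothetical names; it works on a
byte you pass in and does not read the port itself.

```python
# Hypothetical decoder for the NMI status byte read from port 0x61.
# Bit meanings follow the usual PC convention; treat them as an assumption
# for any particular chipset.

NMI_PARITY_OR_SERR = 0x80  # bit 7: memory parity error, or SERR# on PCI
NMI_IOCHK = 0x40           # bit 6: I/O channel check (ISA pin A1 pulled low)


def decode_nmi_reason(reason: int) -> list:
    """Return human-readable causes for an NMI status byte."""
    causes = []
    if reason & NMI_PARITY_OR_SERR:
        causes.append("memory parity error or PCI SERR#")
    if reason & NMI_IOCHK:
        causes.append("I/O channel check (ISA IOCHK#)")
    if not causes:
        causes.append("unknown")
    return causes


if __name__ == "__main__":
    # What an ISA pushbutton between A1 and B1 would be expected to report.
    print(decode_nmi_reason(0x40))
```

A pushbutton on the ISA slot should show up as the bit 6 case, while a PCI
SERR# card should show up as the bit 7 case.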


* Re: very strange (semi-)lockups in 2.4.5
  2001-06-18  7:05 very strange (semi-)lockups in 2.4.5 Pozsar Balazs
  2001-06-18 17:10 ` Albert D. Cahalan
@ 2001-06-19 13:09 ` Pavel Machek
  1 sibling, 0 replies; 7+ messages in thread
From: Pavel Machek @ 2001-06-19 13:09 UTC (permalink / raw)
  To: Pozsar Balazs; +Cc: linux-kernel

Hi!

> I'm having ~2 lockups a day. The following happens:
>  If I was under X, I can only use the magic-key; there is no other keyboard
> (eg numlock) or mouse response, the screen freezes, processes stop.
>  If I was using textmode:
>   numlock still works
>   cursor blinks
>   processes stop (eg, gpm doesn't work, outputs freeze)
>   I can still switch vt's.
>   BUT, I can only type into a few vt's, last time into 3,5,6,7,8, but not
> into 1,2 or 4!
> 
> I cannot give you any traces, as I don't have any.
> 
> Also note that magic-key works, and it says that it umounts filesystems if
> I press magic-u, but next time at mount I see that reiserfs is replaying
> transactions.

I saw something very similar yesterday: 2.4.5, PII/300, 64MB.
I used MAGIC-s,u,b and ext2 came up clean.

-- 
Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt,
details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html.



end of thread, other threads:[~2001-07-08 16:58 UTC | newest]

Thread overview: 7+ messages
2001-06-18  7:05 very strange (semi-)lockups in 2.4.5 Pozsar Balazs
2001-06-18 17:10 ` Albert D. Cahalan
2001-06-18 19:20   ` Pozsar Balazs
2001-06-18 20:45     ` george anzinger
2001-06-18 21:18       ` Albert D. Cahalan
2001-06-19  0:38         ` george anzinger
2001-06-19 13:09 ` Pavel Machek
