linux-kernel.vger.kernel.org archive mirror
* nmi errors?
From: Robert L. Harris @ 2003-09-03 21:20 UTC
  To: Linux-Kernel

Can anyone tell me what this is?

16:00:09 mailserver kernel: Uhhuh. NMI received for unknown reason 31.
16:00:09 mailserver kernel: Dazed and confused, but trying to continue
16:00:09 mailserver kernel: Do you have a strange power saving mode enabled?
16:00:34 mailserver kernel: Uhhuh. NMI received for unknown reason 21.
16:00:34 mailserver kernel: Dazed and confused, but trying to continue

A coworker put a script on a server which loads up quite a few arrays
with pre-set values and then compares the values against arrays.  As soon as he
kicked off the script I got a lot of these in my log files.  Not long afterwards the
machine crashed hard.

Quad-processor P3-550
16 GB of RAM
Kernel: 2.4.22-rc2-ac3

CONFIG_HIGHMEM64G=y
CONFIG_HIGHMEM=y

Anyone have any thoughts or know what this means?  Do I have a HIGHMEM
problem?

:wq!
---------------------------------------------------------------------------
Robert L. Harris                     | GPG Key ID: E344DA3B
                                         @ x-hkp://pgp.mit.edu
DISCLAIMER:
      These are MY OPINIONS ALONE.  I speak for no-one else.

Life is not a destination, it's a journey.
  Microsoft produces 15 car pileups on the highway.
    Don't stop traffic to stand and gawk at the tragedy.


* Re: nmi errors?
From: Richard B. Johnson @ 2003-09-03 21:25 UTC
  To: Robert L. Harris; +Cc: Linux-Kernel

On Wed, 3 Sep 2003, Robert L. Harris wrote:

>
>
> Can anyone tell me what this is?
>
> 16:00:09 mailserver kernel: Uhhuh. NMI received for unknown reason 31.
> 16:00:09 mailserver kernel: Dazed and confused, but trying to continue
> 16:00:09 mailserver kernel: Do you have a strange power saving mode enabled?
> 16:00:34 mailserver kernel: Uhhuh. NMI received for unknown reason 21.
> 16:00:34 mailserver kernel: Dazed and confused, but trying to continue
>
> A coworker put a script on a server which loads up quite a few arrays
> with pre-set values and then compares the values against arrays.  As soon as he
> kicked off the script I got a lot of these in my log files.  Not long afterwards the
> machine crashed hard.
>

Possible bad RAM.

Cheers,
Dick Johnson
Penguin : Linux version 2.4.22 on an i686 machine (794.73 BogoMips).
            Note 96.31% of all statistics are fiction.




* Re: nmi errors?
From: Robert L. Harris @ 2003-09-03 21:34 UTC
  To: Richard B. Johnson; +Cc: Linux-Kernel


We ran "memtest" on the machine over the weekend and it completed 3
times without any problems.  Know a better or different test?





* Re: nmi errors?
From: Richard B. Johnson @ 2003-09-04 12:18 UTC
  To: Robert L. Harris; +Cc: Linux-Kernel

On Wed, 3 Sep 2003, Robert L. Harris wrote:

>
>
> We ran "memtest" on the machine over the weekend and it completed 3
> times without any problems.  Know a better or different test?
>
>

Write 0x80 to port 0x70 and hope nobody accesses the RTC. This
will (should) disable the NMI line. Then see if the error messages
go away. If they do, it's a real NMI and you really do have bad
RAM somewhere. If they don't, your motherboard is getting glitched,
either by bad design or by something plugged into a slot that doesn't
meet the correct timing specs.

If everything works in spite of the NMI, just comment out the
kernel printk() and cross your fingers.






* Re: nmi errors?
From: Martin Schlemmer @ 2003-09-04 15:11 UTC
  To: Robert L. Harris; +Cc: Richard B. Johnson, Linux-Kernel

On Wed, 2003-09-03 at 23:34, Robert L. Harris wrote:
> We ran "memtest" on the machine over the weekend and it completed 3
> times without any problems.  Know a better or different test?
> 

You might try enabling all the tests and all addresses, and setting the
cache to be always on in memtest.  The typical key sequence is:

  c - 1 - 2 - 2 - 3 - 3 - 3

Another option is goldmemory, whose default setup is much the same
as memtest with the above config, but it is shareware, not GPL.

-- 
Martin Schlemmer




* Re: nmi errors?
From: Robert L. Harris @ 2003-09-04 15:25 UTC
  To: Martin Schlemmer; +Cc: Richard B. Johnson, Linux-Kernel


I ran some tests Richard gave me, which said it wasn't bad RAM but a bad
motherboard.  I just upgraded to 2.4.22-bk10 and it ran MUCH better,
able to use all 16 gigs happily for quite some time, until a couple of the
processes started finishing.  Then it gave the NMIs en masse; I guess it
couldn't shove them off to the log server in time.

I'll try the below just to make sure, but this is getting odd.


Thus spake Martin Schlemmer (azarah@gentoo.org):

> On Wed, 2003-09-03 at 23:34, Robert L. Harris wrote:
> > We ran "memtest" on the machine over the weekend and it completed 3
> > times without any problems.  Know a better or different test?
> > 
> 
> You might try enabling all the tests and all addresses, and setting the
> cache to be always on in memtest.  The typical key sequence is:
> 
>   c - 1 - 2 - 2 - 3 - 3 - 3
> 
> Another option is goldmemory, whose default setup is much the same
> as memtest with the above config, but it is shareware, not GPL.

