linux-kernel.vger.kernel.org archive mirror
* Re: nforce2 lockups
@ 2003-08-23 12:20 Mikael Pettersson
  2003-08-23 12:48 ` Patrick Dreker
  0 siblings, 1 reply; 20+ messages in thread
From: Mikael Pettersson @ 2003-08-23 12:20 UTC (permalink / raw)
  To: kenton.groombridge, patrick; +Cc: linux-kernel

On Sat, 23 Aug 2003 10:41:46 +0900, kenton.groombridge@us.army.mil wrote:
>I had done this before but without the nolapic option.  That appears to have been the solution.  Ran a whole day without one lockup where before 10 minutes was rarely achieved.
...
>It looks like the nolapic kernel parameter was just recently introduced.  I tried it in both the 2.4.22-rc2 kernel and the 2.6.0-test3 kernel with success in both.
>
>Would still like apic to be completely fixed in the nforce2 chipsets, but I am just happy to have a working system again.

"nolapic" does not exist in standard 2.4.22-rc2 or 2.6.0-test3 kernels.
The patch which added nolapic support was included in 2.6.0-test3-bk8,
and 2.6.0-test<2or3>-mm<something> before that.

Passing nolapic to a kernel which doesn't recognise it causes it to
simply be passed through to init, with no error message.
So either you used non-standard versions of 2.4.22-rc2/2.6.0-test3,
or nolapic wasn't the thing that fixed your nforce2 board.

"noapic" (note: no "l") might very well fix your board, but that's
a completely different animal: it disables the I/O-APIC, which
handles board-level interrupt routing. In a kernel that supports it,
"nolapic" effectively also disables the I/O-APIC.

acpi=off or pci=noacpi might also fix the board, if ACPI is busted.
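
One way to keep these options apart on a running system is to check what was actually passed at boot. A minimal sketch: has_boot_param is a made-up helper name, and a match only shows that the parameter was on the command line, not that the kernel understood it.

```sh
# has_boot_param PARAM [FILE]
# Succeeds if PARAM appears as a whole word on the kernel command line.
# FILE defaults to /proc/cmdline; the helper name is hypothetical.
has_boot_param() {
    param=$1
    file=${2:-/proc/cmdline}
    # Split on whitespace so "noapic" never matches "nolapic" or vice versa.
    tr ' \t' '\n\n' < "$file" | grep -qxF -- "$param"
}
```

For example, `has_boot_param nolapic && dmesg | grep -i 'local APIC'` would show whether the kernel still logged anything about the local APIC despite the option being passed.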

/Mikael

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: nforce2 lockups
  2003-08-23 12:20 nforce2 lockups Mikael Pettersson
@ 2003-08-23 12:48 ` Patrick Dreker
  0 siblings, 0 replies; 20+ messages in thread
From: Patrick Dreker @ 2003-08-23 12:48 UTC (permalink / raw)
  To: Mikael Pettersson, kenton.groombridge; +Cc: linux-kernel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Saturday, 23 August 2003 14:20, Mikael Pettersson <mikpe@csd.uu.se> wrote
on the subject Re: nforce2 lockups:
> On Sat, 23 Aug 2003 10:41:46 +0900, kenton.groombridge@us.army.mil wrote:
> Passing nolapic to a kernel which doesn't recognise it causes it to
> simply be passed through to init, with no error message.
> So either you used non-standard versions of 2.4.22-rc2/2.6.0-test3,
> or nolapic wasn't the thing that fixed your nforce2 board.
It probably was a combination of the other measures mentioned in my mail then. 
I have (had) the same problems, and one has to completely avoid the local 
APIC it seems. Passing noapic and disabling APIC Mode in the BIOS did not do 
that (2.6.0-test3 + acpi20030730). 

> "noapic" (note: no "l") might very well fix your board, but that's
See above: noapic had no effect on the freezes. On boot it still said "found
and enabling local APIC", and the system froze within minutes when I pushed
data across the network. Only after compiling a kernel with absolutely no
APIC support compiled in did it work. Note that the nmi_watchdog was not
triggered by the freezes, and that seems to run via the APIC, too.

> acpi=off or pci=noacpi might also fix the board, if ACPI is busted.
ACPI works fine with at least acpi20030730. Without ACPI the interrupts
are not assigned correctly; for example, the onboard 3com NIC does not work
(at least here).

- -- 
Patrick Dreker

GPG KeyID  : 0xFCC2F7A7 (Patrick Dreker)
Fingerprint: 7A21 FC7F 707A C498 F370  1008 7044 66DA FCC2 F7A7
Key available from keyservers or http://www.dreker.de/pubkey.asc
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/R2KMcERm2vzC96cRAstLAJ9ontoUXT7DS3Nij6rvHdEs7lN7kwCfRlFt
9rGbg6ebuFgAVpXReX4MXuY=
=u10b
-----END PGP SIGNATURE-----


* Re: nforce2 lockups
  2003-08-23 15:50 Mikael Pettersson
@ 2003-08-23 18:52 ` Patrick Dreker
  0 siblings, 0 replies; 20+ messages in thread
From: Patrick Dreker @ 2003-08-23 18:52 UTC (permalink / raw)
  To: Mikael Pettersson; +Cc: kenton.groombridge, linux-kernel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Saturday, 23 August 2003 17:50, Mikael Pettersson <mikpe@csd.uu.se> wrote
on the subject Re: nforce2 lockups:
> On Sat, 23 Aug 2003 14:48:06 +0200, Patrick Dreker <patrick@dreker.de> wrote:
> >On Saturday, 23 August 2003 14:20, Mikael Pettersson wrote
> >on the subject Re: nforce2 lockups:
> >> On Sat, 23 Aug 2003 10:41:46 +0900, kenton.groombridge@us.army.mil
> >> "noapic" (note: no "l") might very well fix your board, but that's
> >see above. noapic had no effect on the freezes. On boot it still said
> > "found and enabling local APIC"
> Of course it did. "noapic" and BIOS APIC mode relate to the I/O-APIC,
> not the local APIC.
Well, it's my first board which has an APIC, so I'm still learning how it all
fits together...

> My guess is that your BIOS or graphics card can't handle the local
> APIC, presumably due to a crap SMM# handler or you using APM's
> CPU_IDLE or DISPLAY_BLANK options.
Someone else already hinted that it might be a BIOS/SMM problem. APM is
completely disabled.
> Kernel 2.6.0-test4 supports "nolapic", so that's an option too.
> I'll send that patch to Marcelo for 2.4.23-pre so it should be
> in 2.4 eventually.
I'll try that, and Asus have updated their BIOS for the Board, so I'll give 
that a shot, too.
- -- 
Patrick Dreker

GPG KeyID  : 0xFCC2F7A7 (Patrick Dreker)
Fingerprint: 7A21 FC7F 707A C498 F370  1008 7044 66DA FCC2 F7A7
Key available from keyservers or http://www.dreker.de/pubkey.asc
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/R7f+cERm2vzC96cRAvwXAJ9M49KQUWCI37LJXAUamJiM0628lwCfXPjO
eapurx/Indgk60tVCJ96ztc=
=bebd
-----END PGP SIGNATURE-----


* Re: nforce2 lockups
@ 2003-08-23 15:50 Mikael Pettersson
  2003-08-23 18:52 ` Patrick Dreker
  0 siblings, 1 reply; 20+ messages in thread
From: Mikael Pettersson @ 2003-08-23 15:50 UTC (permalink / raw)
  To: patrick; +Cc: kenton.groombridge, linux-kernel

On Sat, 23 Aug 2003 14:48:06 +0200, Patrick Dreker <patrick@dreker.de> wrote:
>On Saturday, 23 August 2003 14:20, Mikael Pettersson <mikpe@csd.uu.se> wrote
>on the subject Re: nforce2 lockups:
>> On Sat, 23 Aug 2003 10:41:46 +0900, kenton.groombridge@us.army.mil wrote:
>> Passing nolapic to a kernel which doesn't recognise it causes it to
>> simply be passed through to init, with no error message.
>> So either you used non-standard versions of 2.4.22-rc2/2.6.0-test3,
>> or nolapic wasn't the thing that fixed your nforce2 board.
>It probably was a combination of the other measures mentioned in my mail then. 
>I have (had) the same problems, and one has to completely avoid the local 
>APIC it seems. Passing noapic and disabling APIC Mode in the BIOS did not do 
>that (2.6.0-test3 + acpi20030730). 
...
>> "noapic" (note: no "l") might very well fix your board, but that's
>see above. noapic had no effect on the freezes. On boot it still said "found 
>and enabling local APIC"

Of course it did. "noapic" and BIOS APIC mode relate to the I/O-APIC,
not the local APIC.

My guess is that your BIOS or graphics card can't handle the local
APIC, presumably due to a crap SMM# handler or you using APM's
CPU_IDLE or DISPLAY_BLANK options.

So the solution for your board is to forcibly avoid the local APIC.
A DMI blacklist entry would do that, as would configuring the
kernel without local APIC support (!SMP && !UP_APIC).
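
To see whether a given build still has local APIC support compiled in, its .config can be checked. A rough sketch, assuming the i386 option names CONFIG_SMP and CONFIG_X86_UP_APIC correspond to the SMP and UP_APIC mentioned above; config_has_lapic is a made-up helper name.

```sh
# config_has_lapic CONFIG_FILE
# Succeeds if the .config enables SMP or uniprocessor local APIC support,
# i.e. if the "!SMP && !UP_APIC" condition is NOT met.
# The helper name and the exact option names are assumptions.
config_has_lapic() {
    grep -Eq '^CONFIG_(SMP|X86_UP_APIC)=y$' "$1"
}
```

For example, `config_has_lapic /usr/src/linux/.config || echo "no local APIC support built in"` (the path is only an example).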

Kernel 2.6.0-test4 supports "nolapic", so that's an option too.
I'll send that patch to Marcelo for 2.4.23-pre so it should be
in 2.4 eventually.

As for getting the DMI blacklist rule in, you or someone else
with that board has to run dmidecode and prepare & test a patch.
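
As a starting point for collecting the strings such a rule would need, the board identification can be pulled out of the dmidecode output. A small sketch only: board_ident is a hypothetical helper, and which DMI fields a blacklist entry should actually match on is left to whoever prepares the patch.

```sh
# board_ident
# Reads dmidecode output on stdin and prints the Manufacturer and
# Product Name lines of the "Base Board Information" section.
# The helper name is hypothetical.
board_ident() {
    awk '
        /^Handle/                { bb = 0 }        # a new DMI record starts
        /Base Board Information/ { bb = 1; next }  # the record we want
        bb && /(Manufacturer|Product Name):/ {
            sub(/^[ \t]+/, "")                     # strip leading indentation
            print
        }
    '
}
```

Typical use would be `dmidecode | board_ident`, run as root on the affected board.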

/Mikael


* Re: nforce2 lockups
@ 2003-08-23  1:41 kenton.groombridge
  0 siblings, 0 replies; 20+ messages in thread
From: kenton.groombridge @ 2003-08-23  1:41 UTC (permalink / raw)
  To: Patrick Dreker; +Cc: linux-kernel

Thank you!!!

I had done this before but without the nolapic option.  That appears to have been the solution.  Ran a whole day without one lockup where before 10 minutes was rarely achieved.

I don't know if you knew I had a $20 reward for the fix, but it looks like you got it.

Going to run a few more days just to be sure, but send me your address and I will send you the $20 after stressing my system a bit more.

It looks like the nolapic kernel parameter was just recently introduced.  I tried it in both the 2.4.22-rc2 kernel and the 2.6.0-test3 kernel with success in both.

Would still like apic to be completely fixed in the nforce2 chipsets, but I am just happy to have a working system again.

Thanks again,
Ken

----- Original Message -----
From: Patrick Dreker <patrick@dreker.de>
Date: Thursday, August 21, 2003 8:05 pm
Subject: Re: nforce2 lockups

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> On Thursday, 21 August 2003 03:39, kenton.groombridge@us.army.mil wrote
> on the subject Re: nforce2 lockups:
> > and it did cure my spurious interrupt problem, but unfortunately, my
> > lockups have returned.
> I managed to stabilize my board, but I don't think the trick was obvious:
> Disable all APIC-related kernel options (Local APIC and IO-APIC), disable
> APIC Mode in the BIOS. Check on reboot if it still talks about the APIC in
> the boot messages (How? IIRC mine did, which was why I did not think that
> disabling the APIC helped... Actually somehow it still was activated. Could
> ACPI be part of this?). If it does, try the noapic and/or nolapic boot options.
> If you completely shut off the APIC it runs stable, but 1 of the 3 USB
> Controllers is not assigned an interrupt. All this with ACPI enabled (ACPI
> patch 20030730 and kernel 2.6.0-test3).
> 
> - -- 
> Patrick Dreker
> 
> GPG KeyID  : 0xFCC2F7A7 (Patrick Dreker)
> Fingerprint: 7A21 FC7F 707A C498 F370  1008 7044 66DA FCC2 F7A7
> Key available from keyservers or http://www.dreker.de/pubkey.asc
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.2.2 (GNU/Linux)
> 
> iD8DBQE/RKdhcERm2vzC96cRAtjAAJ4y5oOm7uhtPqWtaS/S+mnWTr9C5gCdF3hK
> 2JQZ86psKDmWO74wxrINSRE=
> =YbYq
> -----END PGP SIGNATURE-----
> 



* Re: nforce2 lockups
  2003-08-21  1:39 kenton.groombridge
@ 2003-08-21 11:05 ` Patrick Dreker
  0 siblings, 0 replies; 20+ messages in thread
From: Patrick Dreker @ 2003-08-21 11:05 UTC (permalink / raw)
  To: kenton.groombridge, linux-kernel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Thursday, 21 August 2003 03:39, kenton.groombridge@us.army.mil wrote on
the subject Re: nforce2 lockups:
> and it did cure my spurious interrupt problem, but unfortunately, my
> lockups have returned.
I managed to stabilize my board, but I don't think the trick was obvious:
Disable all APIC-related kernel options (Local APIC and IO-APIC), disable
APIC Mode in the BIOS. Check on reboot if it still talks about the APIC in
the boot messages (How? IIRC mine did, which was why I did not think that
disabling the APIC helped... Actually somehow it still was activated. Could
ACPI be part of this?). If it does, try the noapic and/or nolapic boot options.

If you completely shut off the APIC it runs stable, but 1 of the 3 USB 
Controllers is not assigned an interrupt. All this with ACPI enabled (ACPI 
patch 20030730 and kernel 2.6.0-test3).

- -- 
Patrick Dreker

GPG KeyID  : 0xFCC2F7A7 (Patrick Dreker)
Fingerprint: 7A21 FC7F 707A C498 F370  1008 7044 66DA FCC2 F7A7
Key available from keyservers or http://www.dreker.de/pubkey.asc
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/RKdhcERm2vzC96cRAtjAAJ4y5oOm7uhtPqWtaS/S+mnWTr9C5gCdF3hK
2JQZ86psKDmWO74wxrINSRE=
=YbYq
-----END PGP SIGNATURE-----


* Re: nforce2 lockups
@ 2003-08-21  1:39 kenton.groombridge
  2003-08-21 11:05 ` Patrick Dreker
  0 siblings, 1 reply; 20+ messages in thread
From: kenton.groombridge @ 2003-08-21  1:39 UTC (permalink / raw)
  To: linux-kernel

I downloaded and applied the acpi patch patch_2.4.22-rc2_to_acpi-2.4-20030813.bz2 from:

http://sourceforge.net/project/showfiles.php?group_id=36832

and it did cure my spurious interrupt problem, but unfortunately, my lockups have returned.

It is strange that kernel 2.4.22-rc2 had never locked up at all (only with apic enabled).  Ran it for a good week or so with absolutely no lockup (again, only with apic enabled).  I tried my best to make it lock up and it never did, but the spurious interrupts made any device that loaded with IRQ 16 and above pretty much worthless.  If I disabled apic, my lockups returned.

So there was something about 2.4.22-rc2 (with acpi enabled) that prevented lockups (but had tons of spurious interrupts).  Something in patch_2.4.22-rc2_to_acpi-2.4-20030813.bz2 cured the spurious interrupts, but brought back the lockups.

Ken Groombridge

----- Original Message -----
> > I have ASUS A7N8X Deluxe mobo with nForce2 rev 162 without any problems
> > (if not counting inability to enable SiI SATA DMA mode with attached
> > Seagate Barracuda drive).
>
> I have the exact same board (except I'm not using SATA), and it's a nightmare.
> Best uptime so far: a little more than 16 hours. Usually it locks up a lot
> earlier. When I do network transfers I can cause it to lock within a few
> minutes. Under "the other OS" it runs without any problems.
> 
> - -- 
> Patrick Dreker



* Re: nforce2 lockups
  2003-08-15 15:15 ` Clock
  2003-08-15 16:38   ` Alistair J Strachan
@ 2003-08-18 13:28   ` Pavel Machek
  1 sibling, 0 replies; 20+ messages in thread
From: Pavel Machek @ 2003-08-18 13:28 UTC (permalink / raw)
  To: Clock; +Cc: kenton.groombridge, linux-kernel

Hi!

> > I don't think the problem is the IDE.  I have used a promise controller
> > and disabled the onboard IDE and still had lockups.  If you find a solution,
> > please let me know.  If I find one, I will do likewise.
> 
> It looks like the problem is in APIC. When you disable it, it vanishes.
> And, when you enable NMI watchdog, which is handled by APIC,

Another BIOS that dislikes the APIC being on when entering SMM? Perhaps
that board needs a blacklist entry that panics the box if the APIC is activated?

									Pavel

-- 
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]


* Re: nforce2 lockups
  2003-08-18  8:04 kenton.groombridge
  2003-08-18  9:31 ` Ookhoi
@ 2003-08-18 11:00 ` Karel Kulhavý
  1 sibling, 0 replies; 20+ messages in thread
From: Karel Kulhavý @ 2003-08-18 11:00 UTC (permalink / raw)
  To: kenton.groombridge; +Cc: linux-kernel

> This is a hunch: is it possible that gcc is compiling something a bit wrong?
> I know that some instructions when processed in a certain order, can do some
> wacky things.  Maybe a gcc bug is causing the Athlon processor to calculate
> some instructions in the right sequence where it sometimes works, and other
> times doesn't.
> 
> The reason I say this, is that I have read a few posts where one person had
> lock-ups with one distro and not the other.  Kernels are pretty much the same
> (I think we are all downloading the latest kernel source and building our own
> kernels), but gcc is different.

I found that after I moved from kernel 2.4.21 to 2.6.0, the machine could
still be crashed on demand. But when I put 2.4.21 back, it wouldn't crash.
In the meantime, when I swapped the IDE disk for another one with the
same kernel, the crash could still be triggered on demand.

To check whether the crash depends on the content being read, I copied the
swap (disk layout: 1G swap at the beginning, ext2 on the rest) verbatim from
the crash disk to the non-crash disk (the crash came within the first 10
seconds, which at 40MB/s is less than 400M from the beginning of the disk),
and it didn't help. It seems to be highly dependent on a sequence of some
seemingly irrelevant operations during kernel startup.

> 
> Haven't tried it yet, since I am working a project 24/7 that will keep me
> until the end of the month.  Purchased the Athlon XP Gentoo 1.4 CDs, so will
> load then and may get some different results.
> 
> Ken
> 


* Re: nforce2 lockups
  2003-08-18  8:04 kenton.groombridge
@ 2003-08-18  9:31 ` Ookhoi
  2003-08-18 11:00 ` Karel Kulhavý
  1 sibling, 0 replies; 20+ messages in thread
From: Ookhoi @ 2003-08-18  9:31 UTC (permalink / raw)
  To: kenton.groombridge
  Cc: Patrick Dreker, Jussi Laako, Alistair J Strachan, Clock, linux-kernel

kenton.groombridge@us.army.mil wrote (ao):
> It seems that the kernel recognizes all nforce2 chipsets as revision
> 162. That is my bad since I found that seemed to be a common
> denominator. Taking shots in the dark. :^)

My Shuttle SN41G2 also has "NFORCE2: chipset revision 162"

> I will tell you that I know it isn't related to bad hardware. I used a
> program that I borrowed from my office (not a cheap program, and it is
> thorough).

> The reason I say this, is that I have read a few posts where one
> person had lock-ups with one distro and not the other.  Kernels are
> pretty much the same (I think we are all downloading the latest kernel
> source and building our own kernels), but gcc is different.

I've had a few different 2.5 kernels on this one, compiled under always
up to date debian sid (unstable). The system is (and has been) rock
solid, now running 2.5.70 with an uptime of 70 days. Running
http://www.stanford.edu/group/pandegroup/folding

I get this now and then:
favonius kernel: Bank 2: 940040000000017a

favonius kernel: MCE: The hardware reports a non fatal, correctable
incident occurred on CPU 0.

but this goes unnoticed to me. 
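
For the curious, the top bits of that bank value can be decoded by hand. A small sketch, assuming the number printed after "Bank 2:" is the bank's 64-bit status register and using the usual x86 machine-check layout (bit 63 VAL, 62 OVER, 61 UC, 60 EN, 59 MISCV, 58 ADDRV, 57 PCC); mce_status_flags is a made-up helper name.

```sh
# mce_status_flags HEX16
# Prints the names of the flag bits set in bits 63..57 of a 64-bit
# machine-check status value given as 16 hex digits (no 0x prefix).
# The helper name is hypothetical.
mce_status_flags() {
    # Only the top byte is needed, which avoids 64-bit shell arithmetic.
    top=$(printf '%s' "$1" | cut -c1-2)
    byte=$((0x$top))
    flags=
    for pair in 128:VAL 64:OVER 32:UC 16:EN 8:MISCV 4:ADDRV 2:PCC; do
        mask=${pair%%:*}
        name=${pair#*:}
        if [ $((byte & mask)) -ne 0 ]; then
            flags="$flags $name"
        fi
    done
    echo "${flags# }"
}
```

For the value above, `mce_status_flags 940040000000017a` prints `VAL EN ADDRV`: the entry is valid, reporting is enabled, an address was recorded, and the UC (uncorrected) bit is clear, which would be consistent with the "correctable" wording in the log.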

This all with an athlon XP 3000+ btw.


* Re: nforce2 lockups
@ 2003-08-18  8:04 kenton.groombridge
  2003-08-18  9:31 ` Ookhoi
  2003-08-18 11:00 ` Karel Kulhavý
  0 siblings, 2 replies; 20+ messages in thread
From: kenton.groombridge @ 2003-08-18  8:04 UTC (permalink / raw)
  To: Patrick Dreker; +Cc: Jussi Laako, Alistair J Strachan, Clock, linux-kernel

It seems that the kernel recognizes all nforce2 chipsets as revision 162.  That is my bad since I found that seemed to be a common denominator.  Taking shots in the dark. :^)

I will tell you that I know it isn't related to bad hardware.  I used a program that I borrowed from my office (not a cheap program, and it is thorough).  It has loopback plugs for USB, serial, parallel, CDs, etc.  I put in all the loopback plugs and CD, and ran burnin diagnostics nightly for about five days.  It made a total of about 30 complete loops through the hardware.  No problems.  Loaded "the other OS" (I like that) and ran it for three days straight with two different hardware banging programs running and there were no problems.  Ran memtest86 overnight, cycled through the memory a good dozen times, no problems.

I start loading Mandrake 9.1, locks up within the install.  If I am lucky enough to get it installed, it is doomed to lock up in three to five minutes of use.  I have tried every kernel/patch that I know (2.4, 2.6 all with acpi, akmp, pre, bk).  Nothing has helped.  Longest run time, about six hours (because I didn't touch it).

This is a hunch: is it possible that gcc is compiling something a bit wrong?  I know that some instructions, when processed in a certain order, can do some wacky things.  Maybe a gcc bug is causing the Athlon processor to calculate some instructions in the right sequence where it sometimes works, and other times doesn't.

The reason I say this, is that I have read a few posts where one person had lock-ups with one distro and not the other.  Kernels are pretty much the same (I think we are all downloading the latest kernel source and building our own kernels), but gcc is different.

Haven't tried it yet, since I am working a project 24/7 that will keep me until the end of the month.  Purchased the Athlon XP Gentoo 1.4 CDs, so will load them and may get some different results.

Ken

----- Original Message -----
From: Patrick Dreker <patrick@dreker.de>
Date: Monday, August 18, 2003 5:02 am
Subject: Re: nforce2 lockups

> >
> > I have ASUS A7N8X Deluxe mobo with nForce2 rev 162 without any problems
> > (if not counting inability to enable SiI SATA DMA mode with attached
> > Seagate Barracuda drive).
>
> I have the exact same board (except I'm not using SATA), and it's a nightmare.
> Best uptime so far: a little more than 16 hours. Usually it locks up a lot
> earlier. When I do network transfers I can cause it to lock within a few
> minutes. Under "the other OS" it runs without any problems.
> 




* Re: nforce2 lockups
       [not found] <20030817233306.CC67A2D0074@beton.cybernet.src>
@ 2003-08-17 23:41 ` Karel Kulhavý
  0 siblings, 0 replies; 20+ messages in thread
From: Karel Kulhavý @ 2003-08-17 23:41 UTC (permalink / raw)
  To: linux-kernel

> > Best uptime so far: a little more than 16 hours. Usually it locks up a lot 
> > earlier. When I do network transfers I can cause it to lock within a few 
> > minutes. Under "the other OS" it runs without any problems.
> 
> A friend had lots of problems with his NForce2 mobo until he ran memtest86
> and found that the memory was flaky.  His machine has been running linux
> very well since he replaced the memory.  (Heh, two days ago  ;-)
> 
> I wonder if the NForce2 is a bit fussier about the quality of the memory
> than other chipsets.

I ran memtest overnight and everything was 100% OK.

Cl<



* Re: nforce2 lockups
  2003-08-17 19:27     ` Jussi Laako
@ 2003-08-17 20:02       ` Patrick Dreker
  0 siblings, 0 replies; 20+ messages in thread
From: Patrick Dreker @ 2003-08-17 20:02 UTC (permalink / raw)
  To: Jussi Laako, Alistair J Strachan; +Cc: Clock, kenton.groombridge, linux-kernel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sunday, 17 August 2003 21:27, Jussi Laako <jussi.laako@pp.inet.fi> wrote
on the subject Re: nforce2 lockups:
> On Fri, 2003-08-15 at 19:38, Alistair J Strachan wrote:
> > > NFORCE2: chipset revision 162
> > I use APIC and ACPI on my EPoX 8RDA+, and I've never had any IO problems.
> > So it seems unlikely that it is tied to a chipset revision.
>
> I have ASUS A7N8X Deluxe mobo with nForce2 rev 162 without any problems
> (if not counting inability to enable SiI SATA DMA mode with attached
> Seagate Barracuda drive).

I have the exact same Board (except I'm not using SATA), and it's a nightmare. 
Best uptime so far: a little more than 16 hours. Usually it locks up a lot 
earlier. When I do network transfers I can cause it to lock within a few 
minutes. Under "the other OS" it runs without any problems.

- -- 
Patrick Dreker

GPG KeyID  : 0xFCC2F7A7 (Patrick Dreker)
Fingerprint: 7A21 FC7F 707A C498 F370  1008 7044 66DA FCC2 F7A7
Key available from keyservers or http://www.dreker.de/pubkey.asc
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/P99ncERm2vzC96cRAkl3AJ9XG9ShZVlQXqyupyhz08EHNdiPiwCgj/ji
W++fbQC3hOVBvR6xCgV7V6A=
=HVPf
-----END PGP SIGNATURE-----


* Re: nforce2 lockups
  2003-08-15 16:38   ` Alistair J Strachan
  2003-08-15 19:06     ` Clock
@ 2003-08-17 19:27     ` Jussi Laako
  2003-08-17 20:02       ` Patrick Dreker
  1 sibling, 1 reply; 20+ messages in thread
From: Jussi Laako @ 2003-08-17 19:27 UTC (permalink / raw)
  To: Alistair J Strachan; +Cc: Clock, kenton.groombridge, linux-kernel

On Fri, 2003-08-15 at 19:38, Alistair J Strachan wrote:

> > > I found your post looking for a solution to my lockups.  I bet if you do
> > > a dmesg, you will find that your nforce2 chipset revision is 162.
> >
> > Yeah! Look:
> >
> > NFORCE2: chipset revision 162
> 
> NFORCE2: chipset revision 162
> 
> I use APIC and ACPI on my EPoX 8RDA+, and I've never had any IO problems. So 
> it seems unlikely that it is tied to a chipset revision.

I have ASUS A7N8X Deluxe mobo with nForce2 rev 162 without any problems
(if not counting inability to enable SiI SATA DMA mode with attached
Seagate Barracuda drive).

"22:26:25  up 17 days, 11:39,  5 users,  load average: 0.06, 0.02, 0.00"


-- 
Jussi Laako <jussi.laako@pp.inet.fi>



* Re: nforce2 lockups
       [not found] ` <fa.gbe06ic.1ki851c@ifi.uio.no>
@ 2003-08-17 13:00   ` walt
  0 siblings, 0 replies; 20+ messages in thread
From: walt @ 2003-08-17 13:00 UTC (permalink / raw)
  To: Patrick Dreker; +Cc: linux-kernel

Patrick Dreker wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> On Sunday, 17 August 2003 21:27, Jussi Laako <jussi.laako@pp.inet.fi> wrote
> on the subject Re: nforce2 lockups:
> 
>>On Fri, 2003-08-15 at 19:38, Alistair J Strachan wrote:
>>
>>>>NFORCE2: chipset revision 162
>>>
>>>I use APIC and ACPI on my EPoX 8RDA+, and I've never had any IO problems.
>>>So it seems unlikely that it is tied to a chipset revision.
>>
>>I have ASUS A7N8X Deluxe mobo with nForce2 rev 162 without any problems
>>(if not counting inability to enable SiI SATA DMA mode with attached
>>Seagate Barracuda drive).
> 
> 
> I have the exact same Board (except I'm not using SATA), and it's a nightmare. 
> Best uptime so far: a little more than 16 hours. Usually it locks up a lot 
> earlier. When I do network transfers I can cause it to lock within a few 
> minutes. Under "the other OS" it runs without any problems.

A friend had lots of problems with his NForce2 mobo until he ran memtest86
and found that the memory was flaky.  His machine has been running linux
very well since he replaced the memory.  (Heh, two days ago  ;-)

I wonder if the NForce2 is a bit fussier about the quality of the memory
than other chipsets.



* Re: nforce2 lockups
  2003-08-15 17:47       ` Alistair J Strachan
@ 2003-08-15 21:15         ` Clock
  0 siblings, 0 replies; 20+ messages in thread
From: Clock @ 2003-08-15 21:15 UTC (permalink / raw)
  To: Alistair J Strachan; +Cc: linux-kernel

On Fri, Aug 15, 2003 at 06:47:20PM +0100, Alistair J Strachan wrote:
> On Friday 15 August 2003 20:06, Clock wrote:
> [SNIP]
> >
> > I have had three boards with nforce2 replaced (all of them Soltek
> > SL75FRN2-L) and all three did the same. However it seemed the frequency of
> > the crashes varies with actual piece of board.
> 
> That's certainly interesting.
> 
> >
> > The crashes aren't in software - bare 'cat /dev/hda > /dev/null' is
> > often enough to lock up the machine to the point that poweroff fails.
> 
> [root] 06:43 PM [/home/alistair] time cat /dev/discs/disc0/disc > /dev/null
> (I ctrl-C'd here)
> 
> real    1m23.275s
> user    0m0.979s
> sys     0m12.608s
> 
> I don't know how obvious the problem is on your machine, but it's clearly not 
> an issue on this nForce2. When I was referring to software, that included the 
> kernel i.e., I suspect it isn't a design fault.

It seems to occur fairly often just after boot. When you try later,
you usually fail to lock up the machine and have to do a fresh RESET
(not ctrl-alt-del!) to get the behaviour back.

Cl<


* Re: nforce2 lockups
  2003-08-15 16:38   ` Alistair J Strachan
@ 2003-08-15 19:06     ` Clock
  2003-08-15 17:47       ` Alistair J Strachan
  2003-08-17 19:27     ` Jussi Laako
  1 sibling, 1 reply; 20+ messages in thread
From: Clock @ 2003-08-15 19:06 UTC (permalink / raw)
  To: Alistair J Strachan; +Cc: linux-kernel

On Fri, Aug 15, 2003 at 05:38:08PM +0100, Alistair J Strachan wrote:
> On Friday 15 August 2003 16:15, Clock wrote:
> > On Fri, Aug 15, 2003 at 09:12:17PM +0900, kenton.groombridge@us.army.mil wrote:
> > > Hi,
> > >
> > > I found your post looking for a solution to my lockups.  I bet if you do
> > > a dmesg, you will find that your nforce2 chipset revision is 162.
> >
> > Yeah! Look:
> >
> > NFORCE2: chipset revision 162
> 
> [alistair] 05:37 PM [~] dmesg | grep "NFORCE2: chipset"
> NFORCE2: chipset revision 162
> 
> A quick google for "NFORCE2: chipset revision" reveals no chipset revision 
> dmesg except 162. It seems likely most manufacturers are using the same
> revision.
> 
> I use APIC and ACPI on my EPoX 8RDA+, and I've never had any IO problems. So 
> it seems unlikely that it is tied to a chipset revision.

I have had three boards with nforce2 replaced (all of them Soltek SL75FRN2-L)
and all three did the same. However, it seemed the frequency of the crashes
varies with the actual board.

The crashes aren't in software - bare 'cat /dev/hda > /dev/null' is
often enough to lock up the machine to the point that poweroff fails.
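
For anyone wanting to repeat this without reading a whole disk, the same kind of raw read can be bounded with dd. A minimal sketch, not a diagnostic tool: read_stress is a made-up helper name, and the device path and size are whatever applies on the machine being tested.

```sh
# read_stress DEVICE MIB
# Reads the first MIB mebibytes of DEVICE and discards them, like
# 'cat DEVICE > /dev/null' but with an upper bound on the amount read.
# The helper name is hypothetical; reading a raw disk usually needs root.
read_stress() {
    dd if="$1" of=/dev/null bs=1024k count="$2"
}
```

For example, `read_stress /dev/hda 512` reads the first 512 MiB of the first IDE disk.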

Cl<


* Re: nforce2 lockups
  2003-08-15 19:06     ` Clock
@ 2003-08-15 17:47       ` Alistair J Strachan
  2003-08-15 21:15         ` Clock
  0 siblings, 1 reply; 20+ messages in thread
From: Alistair J Strachan @ 2003-08-15 17:47 UTC (permalink / raw)
  To: Clock; +Cc: linux-kernel

On Friday 15 August 2003 20:06, Clock wrote:
[SNIP]
>
> I have had three boards with nforce2 replaced (all of them Soltek
> SL75FRN2-L) and all three did the same. However it seemed the frequency of
> the crashes varies with actual piece of board.

That's certainly interesting.

>
> The crashes aren't in software - bare 'cat /dev/hda > /dev/null' is
> often enough to lock up the machine to the point that poweroff fails.

[root] 06:43 PM [/home/alistair] time cat /dev/discs/disc0/disc > /dev/null
(I ctrl-C'd here)

real    1m23.275s
user    0m0.979s
sys     0m12.608s

I don't know how obvious the problem is on your machine, but it's clearly not 
an issue on this nForce2. When I was referring to software, that included the 
kernel i.e., I suspect it isn't a design fault.

Any other details?

Cheers,
Alistair.


* Re: nforce2 lockups
  2003-08-15 15:15 ` Clock
@ 2003-08-15 16:38   ` Alistair J Strachan
  2003-08-15 19:06     ` Clock
  2003-08-17 19:27     ` Jussi Laako
  2003-08-18 13:28   ` Pavel Machek
  1 sibling, 2 replies; 20+ messages in thread
From: Alistair J Strachan @ 2003-08-15 16:38 UTC (permalink / raw)
  To: Clock, kenton.groombridge; +Cc: linux-kernel

On Friday 15 August 2003 16:15, Clock wrote:
> On Fri, Aug 15, 2003 at 09:12:17PM +0900, kenton.groombridge@us.army.mil wrote:
> > Hi,
> >
> > I found your post looking for a solution to my lockups.  I bet if you do
> > a dmesg, you will find that your nforce2 chipset revision is 162.
>
> Yeah! Look:
>
> NFORCE2: chipset revision 162

[alistair] 05:37 PM [~] dmesg | grep "NFORCE2: chipset"
NFORCE2: chipset revision 162

A quick google for "NFORCE2: chipset revision" reveals no chipset revision 
dmesg except 162. It seems likely most manufacturers are using the same
revision.

I use APIC and ACPI on my EPoX 8RDA+, and I've never had any IO problems. So 
it seems unlikely that it is tied to a chipset revision.

[snip]
>
> It looks like the problem is in APIC. When you disable it, it vanishes.
> And, when you enable NMI watchdog, which is handled by APIC,
> it doesn't work - it counts up to 15 in /proc/interrupts and then stops!

I have not noticed any such APIC issues.

[alistair] 05:36 PM [~] uname -r
2.6.0-test3-mm2

[alistair] 05:37 PM [~] cat /proc/interrupts
           CPU0
  0:    4582940          XT-PIC  timer
  1:      22830    IO-APIC-edge  i8042
  2:          0          XT-PIC  cascade
  4:     340689    IO-APIC-edge  serial
  7:       4881    IO-APIC-edge  parport0
  8:          1    IO-APIC-edge  rtc
  9:          0   IO-APIC-level  acpi
 14:      12942    IO-APIC-edge  ide0
 15:         10    IO-APIC-edge  ide1
 16:          4   IO-APIC-level  bttv0
 19:     504114   IO-APIC-level  EMU10K1, nvidia
 20:      45043   IO-APIC-level  ohci-hcd
 21:          0   IO-APIC-level  ehci_hcd
 22:         82   IO-APIC-level  ohci-hcd
NMI:          0
LOC:    4582946
ERR:          0
MIS:          0

Sounds suspiciously like software to me.

Cheers,
Alistair.


* Re: nforce2 lockups
       [not found] <df962fdf9006.df9006df962f@us.army.mil>
@ 2003-08-15 15:15 ` Clock
  2003-08-15 16:38   ` Alistair J Strachan
  2003-08-18 13:28   ` Pavel Machek
  0 siblings, 2 replies; 20+ messages in thread
From: Clock @ 2003-08-15 15:15 UTC (permalink / raw)
  To: kenton.groombridge; +Cc: linux-kernel

On Fri, Aug 15, 2003 at 09:12:17PM +0900, kenton.groombridge@us.army.mil wrote:
> Hi,
> 
> I found your post looking for a solution to my lockups.  I bet if you do a dmesg, you will find that your nforce2 chipset revision is 162.

Yeah! Look:

NFORCE2: chipset revision 162

:)

> 
> I have found tons of people with this exact problem.  My Abit board will run
> Windows 2000 flawlessly, but lockup in a minute under Linux.

> 
> Currently I have a reward of $20 posted on two lists looking for a solution.  Currently looking to up the ante to $40.
> 
> http://www.nvnews.net/vbulletin/showthread.php?s=&threadid=16264
> 
> http://www.nforcershq.com/forum/viewtopic.php?t=27003
> 
> I don't think the problem is the IDE.  I have used a promise controller
> and disabled the onboard IDE and still had lockups.  If you find a solution,
> please let me know.  If I find one, I will do likewise.

It looks like the problem is in the APIC. When you disable it, it vanishes.
And when you enable the NMI watchdog, which is handled by the APIC,
it doesn't work - it counts up to 15 in /proc/interrupts and then stops!
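
The stalled counter can be checked by sampling the NMI line of /proc/interrupts twice. A rough sketch, assuming the watchdog has been enabled at boot (e.g. with the nmi_watchdog= parameter); nmi_total and nmi_advances are made-up helper names.

```sh
# nmi_total [FILE]
# Prints the sum of the per-CPU NMI counters from /proc/interrupts (or FILE).
nmi_total() {
    awk '$1 == "NMI:" {
        total = 0
        # Add up only the numeric per-CPU columns; some kernels append text.
        for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) total += $i
        print total
    }' "${1:-/proc/interrupts}"
}

# nmi_advances SECONDS
# Succeeds if the NMI count grows over the given interval.
nmi_advances() {
    before=$(nmi_total)
    sleep "$1"
    after=$(nmi_total)
    [ "${after:-0}" -gt "${before:-0}" ]
}
```

For example, `nmi_advances 10 || echo "NMI counter is stuck"` would flag the behaviour described above, where the count stops at 15.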

Cl<


end of thread, other threads:[~2003-08-23 18:52 UTC | newest]

Thread overview: 20+ messages
2003-08-23 12:20 nforce2 lockups Mikael Pettersson
2003-08-23 12:48 ` Patrick Dreker
  -- strict thread matches above, loose matches on Subject: below --
2003-08-23 15:50 Mikael Pettersson
2003-08-23 18:52 ` Patrick Dreker
2003-08-23  1:41 kenton.groombridge
2003-08-21  1:39 kenton.groombridge
2003-08-21 11:05 ` Patrick Dreker
2003-08-18  8:04 kenton.groombridge
2003-08-18  9:31 ` Ookhoi
2003-08-18 11:00 ` Karel Kulhavý
     [not found] <20030817233306.CC67A2D0074@beton.cybernet.src>
2003-08-17 23:41 ` Karel Kulhavý
     [not found] <fa.ih2vscq.35m1rs@ifi.uio.no>
     [not found] ` <fa.gbe06ic.1ki851c@ifi.uio.no>
2003-08-17 13:00   ` walt
     [not found] <df962fdf9006.df9006df962f@us.army.mil>
2003-08-15 15:15 ` Clock
2003-08-15 16:38   ` Alistair J Strachan
2003-08-15 19:06     ` Clock
2003-08-15 17:47       ` Alistair J Strachan
2003-08-15 21:15         ` Clock
2003-08-17 19:27     ` Jussi Laako
2003-08-17 20:02       ` Patrick Dreker
2003-08-18 13:28   ` Pavel Machek
