From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S262288AbTLDPTY (ORCPT ); Thu, 4 Dec 2003 10:19:24 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S262303AbTLDPTY (ORCPT ); Thu, 4 Dec 2003 10:19:24 -0500 Received: from legolas.restena.lu ([158.64.1.34]:26856 "EHLO smtp.restena.lu") by vger.kernel.org with ESMTP id S262288AbTLDPTO (ORCPT ); Thu, 4 Dec 2003 10:19:14 -0500 Subject: RE: NForce2 pseudoscience stability testing (2.6.0-test11) From: Craig Bradney To: b@netzentry.com Cc: dan@dcrdev.demon.co.uk, linux-kernel@vger.kernel.org In-Reply-To: <3FCF25F2.6060008@netzentry.com> References: <3FCF25F2.6060008@netzentry.com> Content-Type: text/plain Message-Id: <1070551149.4063.8.camel@athlonxp.bradney.info> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.5 Date: Thu, 04 Dec 2003 16:19:10 +0100 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org As reported earlier today I had the first lockup this morning in over 5 days uptime. Having had that, I decided to go for the latest gentoo 2.6 test 11-r1 kernel. This means I was now running the following patches: http://dev.gentoo.org/~brad_mssw/kernel_patches/2.6.0/genpatches-0.7/ instead of http://dev.gentoo.org/~brad_mssw/kernel_patches/2.6.0/genpatches-0.6/ on top of vanilla test 11. Anyway.. my two changes on rebuilding the kernel were initially: -Add in preempt (because there were questions asked here) -Remove generic IDE support (I know I have Nvidia IDE so lets only have that one). In that configuration the "multiple hdparm -t /dev/hda" test hung my system. Rebuilt kernel without preempt.. no hang on hdparm test. So in summary, apart from the patch changes as above, the only difference is to my 5 day kernel I now dont have generic IDE support included. So now, I'll leave it on and see how far it goes.. 13 mins so far :). regards Craig On Thu, 2003-12-04 at 13:17, b@netzentry.com wrote: > Dan Creswell wrote: > > Thanks for the input, I'll pass it to the list. > > Most of these Athlon victims are UP users, in fact, I > believe they are exclusively UP. Does MPS 1.1/1.4 play a role > in a UP system ever? I dont think the NForce2 chipset, > where we are seeing these hard hangs (no ping, no screen, > no blinking cursor, no toggling the caps lock, nothing) is > capable of SMP operation. > > Now whats interesting is you finger the IDE as a potential > culprit and think its very low level. Interesting. > > By the way, I've had trouble with SMP on a Tyan board with an > i840 chipset with Linux before - I was never able to resolve > the issue and had to return the board. > > I've beaten on an Intel SR1300 and SR2300 dual Xeon (aka > Micron's Netframe 1610/2610 aka Sun 60x / 65x) and never run > into these hangs with kernels up to 2.4.22. The motherboard > is an Intel SE7501WV2 . > > > >Hi, > >Been following this thread silently for a while and thought > >I'd drop you > >a line as I have some other data you may find useful. > > > >My machine is a dual Xeon with 2Gb, E1000 NIC, MPT LSI SCSI > >disks and an > >IDE CDROM. > > > >2.6-test9 is only stable on this machine with noapic passed in the > >kernel parameters - otherwise, it lock's up in no time flat. > >I can also > >run this kernel in single-processor mode with the APIC enabled and > >that's stable. > > > >2.4.23-rc2 runs fine on the same machine with the APIC enabled > >in SMP mode. > > > >2.4.23-rc5 locks up on this machine if I use the same .config as > for > >-rc2. However, if I disable ACPI and pass "pci=noacpi" to the > kernel, > >this too runs fine. > > - Seems like the ACPI changes in -rc3 are a problem for my > machine. > > > >All of these behaviors have been observed with MPS 1.4 (I've > changed > >that BIOS setting to 1.1 today in preparation for more testing of > the > >above to see if that makes a difference). > > > >I mention all of this because none of my lock-ups have happened > whilst > >accessing the IDE subsystem. I *have* had lockups with > simultaneous > >network and disk access and I've also seen it with simply > >mouse-waggling. I suspect that the problem is *very* low-level > and > >likely related to the interrupt load. In my case, the problems > only > >seem to occur with SMP configurations which makes me suspect there > may > >be a locking/simultaneous update problem. > > > >Oh, forgot to say, my motherboard is a Tyan Thunder S2665 > >(based on the > >intel E7505 chipset). > > > >Hope that helps, > > > >Dan. > > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ >