* Re: Catching NForce2 lockup with NMI watchdog - found?
@ 2003-12-08 3:21 Ross Dickson
2003-12-08 11:36 ` Craig Bradney
0 siblings, 1 reply; 8+ messages in thread
From: Ross Dickson @ 2003-12-08 3:21 UTC (permalink / raw)
To: linux-kernel; +Cc: ross, recbo, B.Zolnierkiewicz
On Monday 08 of December 2003 04:08, Bob wrote:
> >>Sounds great.. maybe you have come across something. Yes, the CPU
> >>Disconnect function arrived in your BIOS in revision of 2003/03/27
> >>"6.Adds"CPU Disconnect Function" to adjust C1 disconnects. The Chipset
> >>does not support C2 disconnect; thus, disable C2 function."
> >>
> >>For me though.. Im on an ASUS A7N8X Deluxe v2 BIOS 1007. From what I can
> >>see the CPU Disconnect isnt even in the Uber BIOS 1007 for this ASUS
> >>that has been discussed.
> >>
> >>Craig
> >
> >I don't have that in MSI K7N2 MCP2-T near the
> >agp and fsb spread spectrum items or anywhere
>> else.
>Use athcool:
> http://members.jcom.home.ne.jp/jacobi/linux/softwares.html#athcool
> or apply kernel patch (2.4 and 2.6 versions were posted already).
>--bart
Please take a look at
Fixes for nforce2 hard lockup, apic, io-apic, udma133 covered
in mailing list.
I approached it from another angle regarding delaying the apic ack in local timer irq
and achieved stability. It would be good to have others try it. Ian Kumlien is also
reporting success so far.
* Re: Catching NForce2 lockup with NMI watchdog - found?
2003-12-08 3:21 Catching NForce2 lockup with NMI watchdog - found? Ross Dickson
@ 2003-12-08 11:36 ` Craig Bradney
2003-12-08 13:34 ` Ross Dickson
2003-12-08 17:40 ` Bob
0 siblings, 2 replies; 8+ messages in thread
From: Craig Bradney @ 2003-12-08 11:36 UTC (permalink / raw)
To: ross; +Cc: linux-kernel, recbo, B.Zolnierkiewicz
On Mon, 2003-12-08 at 04:21, Ross Dickson wrote:
> On Monday 08 of December 2003 04:08, Bob wrote:
> > >>Sounds great.. maybe you have come across something. Yes, the CPU
> > >>Disconnect function arrived in your BIOS in revision of 2003/03/27
> > >>"6.Adds"CPU Disconnect Function" to adjust C1 disconnects. The Chipset
> > >>does not support C2 disconnect; thus, disable C2 function."
> > >>
> > >>For me though.. Im on an ASUS A7N8X Deluxe v2 BIOS 1007. From what I can
> > >>see the CPU Disconnect isnt even in the Uber BIOS 1007 for this ASUS
> > >>that has been discussed.
> > >>
> > >>Craig
> > >
> > >I don't have that in MSI K7N2 MCP2-T near the
> > >agp and fsb spread spectrum items or anywhere
> >> else.
> >Use athcool:
> > http://members.jcom.home.ne.jp/jacobi/linux/softwares.html#athcool
> > or apply kernel patch (2.4 and 2.6 versions were posted already).
> >--bart
>
> Please take a look at
>
> Fixes for nforce2 hard lockup, apic, io-apic, udma133 covered
>
> in mailing list.
>
> I approached it from another angle regarding delaying the apic ack in local timer irq
> and achieved stability. It would be good to have others try it. Ian Kumlien is also
> reporting success so far.
>
Although I had long uptimes before.. and therefore might achieve them
again fairly easily.. I'm now on 2 days 10 hours which has included a
lot of compilation and a lot of idle time, and plenty of the hdparm and
grep tests. I have used only the IRQ0 IO-APIC edge patch.
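One way to check which mode the timer is in is to look at the IRQ 0 line of
/proc/interrupts, which names the controller (XT-PIC or IO-APIC-edge on
kernels of this era). A minimal Python sketch, assuming that layout; the
helper name irq_controller is hypothetical and not from any patch in this
thread:

```python
from typing import Optional


def irq_controller(interrupts_text: str, irq: int = 0) -> Optional[str]:
    """Return the interrupt-controller column for `irq`, or None if absent.

    Assumes the classic /proc/interrupts layout of 2.4/2.6 kernels:
        "  0:   12345678    IO-APIC-edge  timer"
    i.e. IRQ number, one count per CPU, controller name, device name(s).
    """
    for line in interrupts_text.splitlines():
        fields = line.split()
        if not fields or fields[0] != f"{irq}:":
            continue
        # Skip the per-CPU counters; the first non-numeric field is
        # taken to be the controller name.
        for field in fields[1:]:
            if not field.isdigit():
                return field
    return None
```

On a live system this would be called as
irq_controller(open("/proc/interrupts").read()). With the edge patch doing
its job one would expect IO-APIC-edge for IRQ 0 rather than XT-PIC.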
Can someone please note all the patches for 2.6 that people have tried
and what they achieve? I'm starting to get a bit lost, given the fact
that I'm running stable here with only 1 patch. (so far - this is where
it crashes after I click Send I suppose ;) )
-apic
-io-apic (IRQ0 set to XT-PIC incorrectly)
-udma133?
-cpu disconnect patch (missing bios option for ACPI Cx states)
Craig
* Re: Catching NForce2 lockup with NMI watchdog - found?
2003-12-08 11:36 ` Craig Bradney
@ 2003-12-08 13:34 ` Ross Dickson
2003-12-08 17:40 ` Bob
1 sibling, 0 replies; 8+ messages in thread
From: Ross Dickson @ 2003-12-08 13:34 UTC (permalink / raw)
To: Craig Bradney; +Cc: linux-kernel, recbo, B.Zolnierkiewicz, AMartin
On Monday 08 December 2003 21:36, Craig Bradney wrote:
> On Mon, 2003-12-08 at 04:21, Ross Dickson wrote:
> > On Monday 08 of December 2003 04:08, Bob wrote:
> > > >>Sounds great.. maybe you have come across something. Yes, the CPU
> > > >>Disconnect function arrived in your BIOS in revision of 2003/03/27
> > > >>"6.Adds"CPU Disconnect Function" to adjust C1 disconnects. The Chipset
> > > >>does not support C2 disconnect; thus, disable C2 function."
> > > >>
> > > >>For me though.. Im on an ASUS A7N8X Deluxe v2 BIOS 1007. From what I can
> > > >>see the CPU Disconnect isnt even in the Uber BIOS 1007 for this ASUS
> > > >>that has been discussed.
> > > >>
> > > >>Craig
> > > >
> > > >I don't have that in MSI K7N2 MCP2-T near the
> > > >agp and fsb spread spectrum items or anywhere
> > >> else.
> > >Use athcool:
> > > http://members.jcom.home.ne.jp/jacobi/linux/softwares.html#athcool
> > > or apply kernel patch (2.4 and 2.6 versions were posted already).
> > >--bart
> >
> > Please take a look at
> >
> > Fixes for nforce2 hard lockup, apic, io-apic, udma133 covered
> >
> > in mailing list.
> >
> > I approached it from another angle regarding delaying the apic ack in local timer irq
> > and achieved stability. It would be good to have others try it. Ian Kumlien is also
> > reporting success so far.
> >
>
> Although I had long uptimes before.. and therefore might achieve them
> again fairly easily.. I'm now on 2 days 10 hours which has included a
> lot of compilation and a lot of idle time, and plenty of the hdparm and
> grep tests. I have used only the IRQ0 IO-APIC edge patch.
>
> Can someone please note all the patches for 2.6 that people have tried
> and what they achieve? I'm starting to get a bit lost, given the fact
> that I'm running stable here with only 1 patch. (so far - this is where
> it crashes after I click Send I suppose ;) )
>
> -apic
>
> -io-apic (IRQ0 set to XT-PIC incorrectly)
>
> -udma133?
>
> -cpu disconnect patch (missing bios option for ACPI Cx states)
>
> Craig
>
>
>
>
Hi Craig
Here is a link to my original posting
http://linux.derkeiler.com/Mailing-Lists/Kernel/2003-12/1528.html
I have not been working with 2.6, only 2.4.23 and 2.4.22.
My work has been independent of the cpu disconnect function and mpparse
patches spoken of in this posting thread. I think my work may have utility
within the 2.6 environment hence my posting to this thread.
This followup with Ian may shed some more light.
http://linux.derkeiler.com/Mailing-Lists/Kernel/2003-12/1648.html
My test system has also been running ioapic for days now and I have stress
loaded it at times with gigabyte file copies alongside mplayer playing movies in
three windows at the same time. It feels good to see the hardware performing
well. My motivation was that I have a digital imaging application where one
of the PCI cards performs poorly with shared interrupts so I had to fix the
ioapic mode. At one stage I got desperate and went back to xtpic mode.
I tried reprogramming the pirq routing registers to get extra pci irqs other
than just irq11 from the AMD768 docs but of course the nforce2 is different
and it didn't work.
Referring to my posting, the key to stability that I found is part (a), the
apic ack delay. I actually implemented part (b), the ioapic timer edge, first
but my system was still unstable. Part (c), the udma133, was a cleanup along
the way when I thought the problem had to do with ide interface timings. It is
probably not a required part of the solution but it was there so I thought I
would throw it into the ring. In fact, looking back, I wonder if part (a), the
apic ack delay, is enough on its own to get rid of the lockups?
I do not know why the apic timing delay stabilises the system.
Perhaps with the bios an SMI event occurs to do with the cpu disconnect
function and the apic ack gets lost? I do not know, but hopefully someone
can get to the bottom of it. The concept of it being bios dependent is perhaps
not so strange. I found this posting regarding a bad SMI during my research.
http://www.ussg.iu.edu/hypermail/linux/kernel/0203.2/0698.html
I have really found this flying in the dark with respect to hardware docs
very frustrating. I do not have the appropriate AMD cpu nor the nForce2 docs
to be able to delve much deeper into the cause. Heck I even dragged out my
old Intel "Microsystem Components Handbook Volume 1 1984" to look up detail on
the 8259 auto end of interrupt mode used in the virtual wire mode. I had never seen
anyone use it before.
I hope that whichever solutions best fix the problems end up in the kernel tree
for all our benefit. This has cost me 2 weeks of little sleep to get this far and being
in small business it is also two weeks of unpaid time. I wish the hardware industry
had an open documents philosophy as good as linux has open source.
Regards
Ross.
* Re: Catching NForce2 lockup with NMI watchdog - found?
2003-12-08 11:36 ` Craig Bradney
2003-12-08 13:34 ` Ross Dickson
@ 2003-12-08 17:40 ` Bob
2003-12-09 16:44 ` merged in bk5 " Bob
1 sibling, 1 reply; 8+ messages in thread
From: Bob @ 2003-12-08 17:40 UTC (permalink / raw)
To: linux-kernel
We're all trying to get acpi, apic, lapic, io-apic working
when turned on in cmos/bios and kernel.
The three things that have each, alone, achieved stability
on somebody's system here are:
1) bios update
2) cpu disconnect off, either in cmos if available, or by
   athcool or a kernel patch that does the same
3) timing delay patch
For CPU disconnect you still need athcool or this one
http://www.kernel.org/pub/linux/kernel/people/bart/2.6.0-test11-bart1/broken-out/nforce2-disconnect-quirk.patch
Both patches are for 2.6.0-test11 kernel.
turn on ioapic edge timer--
http://www.kernel.org/pub/linux/kernel/people/bart/2.6.0-test11-bart1/broken-out/nforce2-apic.patch
Other changes offer clues and expose symptoms but
are not helpful or necessary after the fix is in. udma
settings are a kludge for an error symptom, just aspirin.
Other kludges are acpi off, local apic off in kernel,
apic off in cmos/bios. These go away when the real
problem is fixed.
-Bob
Craig Bradney wrote:
>On Mon, 2003-12-08 at 04:21, Ross Dickson wrote:
>
>
>>On Monday 08 of December 2003 04:08, Bob wrote:
>> > >>Sounds great.. maybe you have come across something. Yes, the CPU
>> > >>Disconnect function arrived in your BIOS in revision of 2003/03/27
>> > >>"6.Adds"CPU Disconnect Function" to adjust C1 disconnects. The Chipset
>> > >>does not support C2 disconnect; thus, disable C2 function."
>> > >>
>> > >>For me though.. Im on an ASUS A7N8X Deluxe v2 BIOS 1007. From what I can
>> > >>see the CPU Disconnect isnt even in the Uber BIOS 1007 for this ASUS
>> > >>that has been discussed.
>> > >>
>> > >>Craig
>> > >
>> > >I don't have that in MSI K7N2 MCP2-T near the
>> > >agp and fsb spread spectrum items or anywhere
>> >> else.
>>
>>
>>>Use athcool:
>>> http://members.jcom.home.ne.jp/jacobi/linux/softwares.html#athcool
>>>or apply kernel patch (2.4 and 2.6 versions were posted already).
>>>--bart
>>>
>>>
>>Please take a look at
>>
>>Fixes for nforce2 hard lockup, apic, io-apic, udma133 covered
>>
>>in mailing list.
>>
>>I approached it from another angle regarding delaying the apic ack in local timer irq
>>and achieved stability. It would be good to have others try it. Ian Kumlien is also
>>reporting success so far.
>>
>>
>>
>
>Although I had long uptimes before.. and therefore might achieve them
>again fairly easily.. I'm now on 2 days 10 hours which has included a
>lot of compilation and a lot of idle time, and plenty of the hdparm and
>grep tests. I have used only the IRQ0 IO-APIC edge patch.
>
>Can someone please note all the patches for 2.6 that people have tried
>and what they achieve? I'm starting to get a bit lost, given the fact
>that I'm running stable here with only 1 patch. (so far - this is where
>it crashes after I click Send I suppose ;) )
>
>-apic
>
>
-local apic ("lapic")
-acpi
The kernel local apic issue, acpi and apic go together,
but turning off lapic first might achieve stability for
some; updating the bios will enable using all of the bios and
linux apic, acpi, lapic and ioapic options for some (me, twice).
>-io-apic (IRQ0 set to XT-PIC incorrectly)
>
>-udma133?
>
>
udma133 may be a clue but I don't think anyone
achieves stability one way or the other on that. I
flogged every possible hdparm change and tried
three brands of hd controller without ever
achieving stability, but once you're stable by
using other means you can use 133 and unmask
irq (not for siig sis?) and other hdparm opts.
>-cpu disconnect patch (missing bios option for ACPI Cx states)
>
>Craig
>
>
* merged in bk5 Re: Catching NForce2 lockup with NMI watchdog - found?
2003-12-08 17:40 ` Bob
@ 2003-12-09 16:44 ` Bob
2003-12-09 16:43 ` Bartlomiej Zolnierkiewicz
0 siblings, 1 reply; 8+ messages in thread
From: Bob @ 2003-12-09 16:44 UTC (permalink / raw)
To: linux-kernel
if you're following this thread, good news--
nforce2 fixups have been merged in
linux-2.6.0-test11-bk5.patch
> -bk snapshot (patch-2.6.0-test11-bk5)
nforce2-disconnect-quirk.patch
> [x86] fix lockups with APIC support on nForce2
>
>nforce2-apic.patch
> [x86] do not wrongly override mp_ExtINT IRQ
plus promise and sis fixes so I don't need to pay
for a 3ware controller ;-) that was another
show-stopper for me earlier
> We're all trying to get acpi, apic, lapic, io-apic working
> when turned on in cmos/bios and kernel.
>
> The three things that each alone have achieved stability
> on somebody's system here are 1) bios update 2) cpu
> disconnect off either in cmos if available or by athcool
> or kernel patch with same 3) timing delay patch
>
> For CPU disconnect you still need athcool or this one
> http://www.kernel.org/pub/linux/kernel/people/bart/2.6.0-test11-bart1/broken-out/nforce2-disconnect-quirk.patch
>
>
> Both patches are for 2.6.0-test11 kernel.
>
> turn on ioapic edge timer--
>
> http://www.kernel.org/pub/linux/kernel/people/bart/2.6.0-test11-bart1/broken-out/nforce2-apic.patch
>
>
>
> Other changes offer clues and expose symptoms but
> are not helpful or necessary after the fix is in. udma
> settings are a kludge for an error symptom, just aspirin.
>
> Other kludges are acpi off, local apic off in kernel,
> apic off in cmos/bios. These go away when the real
> problem is fixed.
>
> -Bob
>
> Craig Bradney wrote:
>
>> On Mon, 2003-12-08 at 04:21, Ross Dickson wrote:
>>
>>
>>> On Monday 08 of December 2003 04:08, Bob wrote:
>>> > >>Sounds great.. maybe you have come across something. Yes, the CPU
>>> > >>Disconnect function arrived in your BIOS in revision of 2003/03/27
>>> > >>"6.Adds"CPU Disconnect Function" to adjust C1 disconnects. The Chipset
>>> > >>does not support C2 disconnect; thus, disable C2 function."
>>> > >>
>>> > >>For me though.. Im on an ASUS A7N8X Deluxe v2 BIOS 1007. From what I can
>>> > >>see the CPU Disconnect isnt even in the Uber BIOS 1007 for this ASUS
>>> > >>that has been discussed.
>>> > >>
>>> > >>Craig
>>> > >
>>> > >I don't have that in MSI K7N2 MCP2-T near the
>>> > >agp and fsb spread spectrum items or anywhere
>>> >> else.
>>>
>>>> Use athcool:
>>>> http://members.jcom.home.ne.jp/jacobi/linux/softwares.html#athcool
>>>> or apply kernel patch (2.4 and 2.6 versions were posted already).
>>>> --bart
>>>
>>> Please take a look at
>>> Fixes for nforce2 hard lockup, apic, io-apic, udma133 covered
>>>
>>> in mailing list.
>>>
>>> I approached it from another angle regarding delaying the apic ack
>>> in local timer irq
>>> and achieved stability. It would be good to have others try it. Ian
>>> Kumlien is also
>>> reporting success so far.
>>>
>>>
>>
>>
>> Although I had long uptimes before.. and therefore might achieve them
>> again fairly easily.. I'm now on 2 days 10 hours which has included a
>> lot of compilation and a lot of idle time, and plenty of the hdparm and
>> grep tests. I have used only the IRQ0 IO-APIC edge patch.
>>
>> Can someone please note all the patches for 2.6 that people have tried
>> and what they achieve? I'm starting to get a bit lost, given the fact
>> that I'm running stable here with only 1 patch. (so far - this is where
>> it crashes after I click Send I suppose ;) )
>>
>> -apic
>>
>>
> -local apic("lapic")
>
> -acpi
>
> kernel local apic issue and acpi and apic go together,
> but turning off lapic first might achieve stability for
> some, updating bios will enable using all bios and
> linux apic, acpi, lapic, ioapic for some(me twice).
>
>> -io-apic (IRQ0 set to XT-PIC incorrectly)
>>
>> -udma133?
>>
>>
> udma133 may be a clue but I don't think anyone
> achieves stability one way or the other on that. I
> flogged every possible hdparm change and tried
> three brands of hd controller without ever
> achieving stability, but once you're stable by
> using other means you can use 133 and unmask
> irq(not for siig sis?) and other hdparm opts.
>
>> -cpu disconnect patch (missing bios option for ACPI Cx states)
>>
>> Craig
>>
>>
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe
> linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
* Re: merged in bk5 Re: Catching NForce2 lockup with NMI watchdog - found?
2003-12-09 16:44 ` merged in bk5 " Bob
@ 2003-12-09 16:43 ` Bartlomiej Zolnierkiewicz
0 siblings, 0 replies; 8+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2003-12-09 16:43 UTC (permalink / raw)
To: Bob; +Cc: linux-kernel
On Tuesday 09 of December 2003 17:44, Bob wrote:
> if you're following this thread, good news--
>
> nforce2 fixups have been merged in
No. This is just a part of the changelog for the -bart1 patch. :-)
> linux-2.6.0-test11-bk5.patch
> > -bk snapshot (patch-2.6.0-test11-bk5)
I included the latest snapshot because my patch is for vanilla kernels.
> nforce2-disconnect-quirk.patch
>
> > [x86] fix lockups with APIC support on nForce2
> >
> >nforce2-apic.patch
> > [x86] do not wrongly override mp_ExtINT IRQ
--bart
* RE: merged in bk5 Re: Catching NForce2 lockup with NMI watchdog - found?
@ 2003-12-09 22:57 b
2003-12-10 0:51 ` Josh McKinney
0 siblings, 1 reply; 8+ messages in thread
From: b @ 2003-12-09 22:57 UTC (permalink / raw)
To: recbo, linux-kernel
Is this stuff going to be merged into 2.4 soon? I'd like
to try a 2.4.23/24-bk with these patches.
>From: Bob
>Subject: merged in bk5 Re: Catching NForce2 lockup with NMI
>
>if you're following this thread, good news--
>
>nforce2 fixups have been merged in
>linux-2.6.0-test11-bk5.patch
>> -bk snapshot (patch-2.6.0-test11-bk5)
>
>nforce2-disconnect-quirk.patch
>> [x86] fix lockups with APIC support on nForce2
>>
>>nforce2-apic.patch
>> [x86] do not wrongly override mp_ExtINT IRQ
>
>plus promise and sis fixes so I don't need to pay
>for a 3ware controller ;-) that was another
>show-stopper for me earlier
>
>> We're all trying to get acpi, apic, lapic, io-apic working
>> when turned on in cmos/bios and kernel.
>>
* Re: merged in bk5 Re: Catching NForce2 lockup with NMI watchdog - found?
2003-12-09 22:57 b
@ 2003-12-10 0:51 ` Josh McKinney
0 siblings, 0 replies; 8+ messages in thread
From: Josh McKinney @ 2003-12-10 0:51 UTC (permalink / raw)
To: linux-kernel
On approximately Tue, Dec 09, 2003 at 02:57:11PM -0800, b@netzentry.com wrote:
> Is this stuff going to be merged into 2.4 soon? I'd like
> to try a 2.4.23/24-bk with these patches.
>
These patches are extremely simple. If they don't patch cleanly just
edit the files directly. The first one is 2 lines and the second adds
17 lines.
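To judge whether a patch is small enough to apply by hand, it can help to
count its added and removed lines first. A minimal Python sketch; diff_size
is a hypothetical helper, not something from either nforce2 patch:

```python
from typing import Tuple


def diff_size(patch_text: str) -> Tuple[int, int]:
    """Count (added, removed) lines in a unified diff.

    File header lines ("--- a/file", "+++ b/file") are skipped.
    Limitation of this simple approach: a changed line whose own text
    begins with "--" or "++" looks like a header and is not counted.
    """
    added = removed = 0
    for line in patch_text.splitlines():
        if line.startswith("+++") or line.startswith("---"):
            continue  # file header, not a change
        if line.startswith("+"):
            added += 1
        elif line.startswith("-"):
            removed += 1
    return added, removed
```

Called as diff_size(open("nforce2-apic.patch").read()), it returns a pair of
counts; if both are small, editing the source files directly as suggested
above is a reasonable fallback when the patch does not apply cleanly.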
>
> >From: Bob
> >Subject: merged in bk5 Re: Catching NForce2 lockup with NMI
> >
> >if you're following this thread, good news--
> >
> >nforce2 fixups have been merged in
> >linux-2.6.0-test11-bk5.patch
> >> -bk snapshot (patch-2.6.0-test11-bk5)
> >
> >nforce2-disconnect-quirk.patch
> >> [x86] fix lockups with APIC support on nForce2
> >>
> >>nforce2-apic.patch
> >> [x86] do not wrongly override mp_ExtINT IRQ
> >
> >plus promise and sis fixes so I don't need to pay
> >for a 3ware controller ;-) that was another
> >show-stopper for me earlier
> >
> >> We're all trying to get acpi, apic, lapic, io-apic working
> >> when turned on in cmos/bios and kernel.
> >>
>
>
--
Josh McKinney | Webmaster: http://joshandangie.org
--------------------------------------------------------------------------
| They that can give up essential liberty
Linux, the choice -o) | to obtain a little temporary safety deserve
of the GNU generation /\ | neither liberty or safety.
_\_v | -Benjamin Franklin